1 Introduction


This document describes the REX3, part of the Newport (“the least graphics you'll ever need”) graphics
subsystem.


1.1 Part Name and Number 


Part Name: REX3 

SGI Part Number: 099-9005-001 

Vendor: LSI Logic Corporation 

Vendor Part Number: L1A9040 

Technology: LC300K (0.6 micron CMOS gate array) 

Base Wafer: L300415P 

Package: 304 MQUAD 

Gate Count: 149,000 equivalent gates, including 5.7K bits of dual-port RAM.


1.2 General Description 


REX3 is the raster engine for Newport graphics. The basic operation of the raster engine is to draw
lines and spans. Various packed formats of host DMA are also supported. It is based on some of the
concepts of REX1, i.e. there is no dedicated geometry engine for graphics. Instead, the host's floating
point unit is used as the geometry engine. Like REX1, Z buffering is done by the host in system memory.
REX3's register interface has been optimized for minimum host writes to execute primitives. REX3 has
various pixel formats to accommodate a low cost 8 bits/pixel system as well as a 24 bits/pixel system.
Besides the pixel planes, REX3 supports CID, PUP and Overlay planes. Also, in order to achieve high
frame buffer writing bandwidth, the frame buffer is architected as an 8 way interleave combined with a
Y axis interleave. There are two sets of RGBA iterators, so 2 shaded pixels/clock are generated. For flat
filled spans, four pixels/clock are generated. In order to bound the package size to less than 304 pins,
the frame buffer data is byte serialized for each of the eight interleaves. This data is deserialized by
the RB2s before writing to the frame buffer. In order to limit the number of gates in REX3, the
read/write formatters and the logicop functions have been incorporated into the RB2s.


1.3 Features 


* 33 MHz GIO64 Bus Interface
* 66 MHz Isotropic 8 way interleaved Frame Buffer Interface
* 33 MHz Display Bus Interface with synchronous / asynchronous / burst mode slave support
* Bresenham line iterators
* RGB and CI anti-aliased Bresenham lines
* Bi-endian support
* Software Z buffer
* Blend function
* 1280 x 1024 resolution
* Up to 76 Hz screen refresh
* Upgradable from 8 pixel + 2 PUP + 2 CID planes to 24 pixel + 8 Overlay (or 4+4) + 2 PUP + 2 CID planes
* Optional Express Video ready
* GenLock capability


1.4 Newport Architecture


Newport graphics is made of the following major components:

1. REX3
2. RB2
3. Frame buffer
4. RO1
5. XMAP9
6. CMAP
7. VC2
8. RAMDAC
9. Static RAM


The graphics pipeline begins with the host writing into the REX3 registers to execute primitives. REX3
transforms these primitives into screen coordinates and writes the data via RB2 into the frame buffer. The
frame buffer is made of VRAMs (2 Mbit) in an 8 way interleave configuration. The serial ports of the frame
buffer are read into RO1 and passed into XMAP9, which manipulates the data for the multi mode screen.
XMAP9 passes the data on to the CMAP, which consists of high speed static RAM for Color Index modes. When
in RGB mode, the data goes through other static RAM within CMAP that is normally linearly mapped, although
for image processing applications it does not have to be linearly mapped. The output of CMAP is fed into
the RAMDAC for display to the screen. The gamma correction tables reside in the RAMDAC. The output of
the CMAPs is also fed back to XMAP9 and output onto the video port. Video data can also be accepted
from the video port and output to the CMAP for display on the graphics monitor. VC2 provides all the
relevant timing for the graphics subsystem.

A block diagram of the Newport graphics sub-system is shown in Figure 1. 


1.5 REX3 Architecture 


Figure 2 shows the top level block diagram of REX3. REX3 can be viewed as three logical blocks.

The first block, which interfaces to the host bus (GIO64), is the GIO block. REX3 supports both the GIO64
and GIO32 protocols, the default being GIO64. The GIO64 bus may be either 64 or 32 bits wide. This block
receives commands for all the primitives that REX3 draws and provides host access to other devices in the
display and (optional) video subsystems. REX3 is implemented as a GIO64 bus slave which decodes addresses
on the GIO64 bus to detect accesses to its own registers, or those within the Video subsystem. Commands
and data to and from the Display subsystem are sent over the Display Control Bus. The REX3 is the master
of the Display Control Bus.

The second block is the iterator block. This block generates the frame buffer addresses, interpolates the
colors and provides masking and various patterning capabilities. The pixel address generation for lines is
done by Bresenham iterators. This block also handles the coverage values for anti-aliased lines and does
the swizzle for the frame buffer interleaving.

The third block is the memory controller and pixel pipe, of which there are four instances. This block
contains the frame buffer controller as well as the CID checking, color compare, dither and blend
functions.

The GIO and iterator sections operate at 33 MHz, and the memory controller and pixel pipe operate at
66 MHz. The GIO interface with the host is via a FIFO which is 64 bits wide and 32 deep. The high water
mark on the GIO FIFO is programmable. The iterator section communicates with the memory controller and
pixel pipe via 4 bank FIFOs. Each bank FIFO consists of one write and two read FIFOs. For screen to screen
copy operations the iterator section generates a read into the read bank FIFOs and swizzles the data before
writing it into the write bank FIFOs. The memory controller operates each of the 4 banks independently of
the others. The memory is cycled in 4 clocks (60 ns) for page mode operations. Figure 3 shows the internal
data path of REX3.
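
As a rough cross-check of the figures quoted above, the sketch below derives the page mode cycle time and the GIO FIFO capacity from the stated clock rates. The peak pixel rates assume that "pixels/clock" refers to the 33 MHz iterator clock, which the text does not state explicitly, and all function and constant names are illustrative.

```python
# Back-of-the-envelope figures derived from the clock rates quoted in the text.
# Assumption: "33 MHz" and "66 MHz" are nominal values.

MEM_CLOCK_HZ = 66e6    # memory controller and pixel pipe clock
ITER_CLOCK_HZ = 33e6   # GIO and iterator clock


def page_cycle_ns(clocks=4, clock_hz=MEM_CLOCK_HZ):
    """Duration of one page mode memory cycle in nanoseconds."""
    return clocks / clock_hz * 1e9


def fifo_bytes(width_bits=64, depth=32):
    """Capacity of the GIO FIFO in bytes."""
    return width_bits * depth // 8


def peak_pixels_per_second(pixels_per_clock, clock_hz=ITER_CLOCK_HZ):
    """Upper bound on pixel generation rate (assumes the iterator clock)."""
    return pixels_per_clock * clock_hz


if __name__ == "__main__":
    print("page cycle (ns):", round(page_cycle_ns(), 1))
    print("GIO FIFO bytes:", fifo_bytes())
    print("shaded pixels/s (upper bound):", peak_pixels_per_second(2))
    print("flat pixels/s (upper bound):", peak_pixels_per_second(4))
```

Four clocks at a nominal 66 MHz come to slightly more than 60 ns, which agrees with the rounded figure in the text.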
1.6 Performance


Table 1: REX3 Performance


FIGURE 2. REX3 top-level block diagram


FIGURE 3. REX3 Internal Data Path


2 Device Interface


2.1 Pin Diagram 


FIGURE 4. REX3 Pin Diagram (247 I/O pins)


2.2 Pin Descriptions


The following tables list for each REX3 pin the assertion level, direction (I, O, I/O), LSI I/O cell type,
followed by a brief functional description.


TABLE 2. GIO64 Bus Interface


Pin Name           Level  Type          Function

P_AD[63:0]         NA     I/O(BD8TRPU)  64-bit pipelined address/data bus.

P_AS_N             L      I(TLCHT)      Asserted during an Address cycle on the GIO bus.

P_READ             NA     I(TLCHT)      Indicates the direction of the data transfer during Address
                                        cycles. After the Address cycle, P_READ is driven low to
                                        indicate that an active bus cycle is taking place. The GIO64
                                        bus master preempts a transaction by asserting P_READ.

GRXDLY             H      O(BT8RP)      When asserted, this signal indicates that for read data
                                        cycles, the REX3 is not returning valid data on the P_AD bus.
                                        For write cycles, the REX3 asserts GRXDLY when the next
                                        transfer on the non-pipelined side of the GIO64 bus must be
                                        stalled (one more word will be accepted by the REX3).

MEMDLY             H      I(TLCHT)      When deasserted during write data cycles, this signal
                                        indicates that the host is presenting valid data on the GIO64
                                        bus. When asserted during read data cycles, this signal
                                        indicates that the host cannot accept data from the REX3
                                        during the next cycle.

FIFO_INT_N         L      O(BT4OD)      REX3 GFIFO/BFIFO above/below interrupt (Open Drain).

VV_INT_N           L      O(BT4OD)      VC2 Vertical retrace or Kaleidoscope Video Option interrupt
                                        (Open Drain).

SLOT_NUMBER[1:0]   NA     I(TLCHT)      Address bits [23:22] of the Newport graphics board. Address
                                        bits [31:24] = "0001_1111".

GIORESET_N         L      I(TLCHT)      Synchronous reset.

GIO64CLK           NA     I(CMOS)       Positive GIO64 bus clock. All GIO64 bus signals are clocked
                                        on the rising edge of this signal.


TABLE 3. VRAM/RB2/REORG Interface


Pin Name                 Type          Function

VRAM_RAS_[A:D]           O(B4)         VRAM RAS, for the four memory banks [A:D].

VRAM_ADDR_[A:D][8:0]     O(BT4RP)      VRAM Address bus.

VRAM_DTOE_N_[A:D]        O(BT4RP)      VRAM Transfer Enable / Output Enable.

VRAM_DSF1_[A:D]          O(BT4RP)      VRAM special function control pin.

VRAM_WBWE_N_[A:D]        O(B4)         VRAM bank write enable (active low).

VRAM_CAS_[A:D]_0         O(BT4RP)      VRAM CAS for the even halves of the four memory banks.

VRAM_CAS_[A:D]_1         O(BT4RP)      VRAM CAS for the odd halves of the four memory banks.

RB2_SEL_[A:D][2:0]       O(B4)         Operation selects for the four memory banks. Encoded as
                                       follows:
                                       000  NOOP
                                       001  Write (4 components), lower pixel into OLY planes
                                       010  Write higher pixel into OLY planes
                                       011  Load write mask and partial DRAWMODE1 Regs
                                       100  Read (4 components), lower pixel of OLY planes
                                       101  Read higher pixel of OLY planes
                                       110  Read lower pixel CID bits (for CID checking)
                                       111  Read higher CID bits (for CID checking)

RB2_DATA_[A:D]_0[7:0]    I/O(BD8TRPU)  RB2 data for the even halves of the four memory banks.

RB2_DATA_[A:D]_1[7:0]    I/O(BD8TRPU)  RB2 data for the odd halves of the four memory banks.

VC_TX_REQ                I(TLCHT)      Transfer request.

VC_SET_TSC               I(TLCHT)      Set top of scan.

RO_Y_DISP[1:0]           O(BT4RP)      Scanline (modulo-4) for staggering the frame buffer.


TABLE 4. Display Control Bus Interface














Pin Name           Level  Type          Function

DCB_DATA[7:0]      NA     I/O(BD8TRPU)  Data read from (DCB_RW_N = 1) or written to (DCB_RW_N = 0)
                                        the Display Control Bus slave devices.

DCB_ADDR[3:0]      NA     O(BT4RP)      Display Control Bus slave device address.

DCB_CRS[2:0]       NA     O(BT8RP)      Display Control Bus slave device command or register select
                                        field.

DCB_RW_N           NA     O(BT8RP)      Read/Write direction signal.

DCB_CS_N           NA     O(BT4RP)      Display Control Bus command strobe, indicating that valid
                                        DCB_ADDR, DCB_CRS, DCB_RW_N and, for write transfers,
                                        DCB_DATA are on the bus.

DCB_ACK_N          L      I(IBUFN)      Acknowledge signal for Display Control Bus slaves to
                                        handshake transfers with the REX3. When asserted during
                                        write cycles, DCB_ACK_N indicates that the slave device has
                                        accepted the DCB_DATA, and that the next Display Control Bus
                                        cycle may begin. During read cycles, the Display Control Bus
                                        slave asserts DCB_ACK_N to indicate that it has placed valid
                                        data on the DCB_DATA lines.


TABLE 5. Miscellaneous Back-End Pins


Pin Name           Level  Type          Function

VERT_INT_N         L      I(IBUFN)      Vertical retrace/sync interrupt from VC2.

VIDEO_INT_N        L      I(IBUFN)      Interrupt from Express Video option.


TABLE 6. ASIC Mandatory PLL and Test Pins


Pin Name           Level  Type          Function

JTAG_TDI           NA     I(TLCHTU)     Scan Test Data In.

JTAG_TMS           NA                   ... driven low. Driven high for normal operation.

JTAG_TCK           NA     I(TLCHTU)     Scan Test Clock.

JTAG_TDO           NA     O(B2)         Scan Test Data/Parametric NAND tree/PLL Test Clock Out.

TEI                NA     I(TLCHN)      I/O pin tristate enable. When driven low, all bidirectional
                                        pins and tri-state unidirectional pins are forced into high
                                        impedance state. Driven high for normal operation.

TP[1:0]            NA     I(TLCHT)      PLL/Scan Test Mode. Encoded as follows:
                                        00  Normal operation. VCO ripple counter output -> JTAG_TDO
                                        01  PLL bypass mode. Scan chain output -> JTAG_TDO
                                        10  PLL bypass mode. Parametric NAND tree -> JTAG_TDO
                                        11  Scan mode. JTAG_TCK drives all flops. Scan chain
                                            output -> JTAG_TDO. VCO is disabled for IDD test

PLL_RESET_N        L      I(TLCHT)      PLL Reset. The loop filter output is grounded when asserted.

LP1                NA     O(DDRVO)      PLL Charge Pump Output / Loop Filter Input.

LP2                NA     I/O(RDDRVPD)  PLL VCO input / Loop Filter Output.

AVDD               NA     I(RDDRV)      PLL Analog VDD.

AVSS               NA     I(RDDRV)      PLL Analog VSS.

AGND               NA     O(RDDRVO)     PLL Analog Ground.


2.3 VHDL Description


This section describes the device level interface to the REX3 as a VHDL entity. 


entity REX3 is
    port(
        -- GIO64 Bus Interface (74 pins)
        P_AD          : inout mvl7w_vector (63 downto 0);
        P_AS_N        : in    mvl7w;
        P_READ        : in    mvl7w;
        GRXDLY        : out   mvl7w;
        MEMDLY        : in    mvl7w;
        FIFO_INT_N    : out   mvl7w;
        VV_INT_N      : out   mvl7w;
        SLOT_NUMBER   : in    mvl7w_vector (1 downto 0);
        GIORESET_N    : in    mvl7w;
        GIO64CLK      : in    mvl7w;

        -- VRAM/RB2/REORG Interface (140 pins)
        VRAM_RAS_A    : out   mvl7w;
        VRAM_ADDR_A   : out   mvl7w_vector (8 downto 0);
        VRAM_DTOE_N_A : out   mvl7w;
        VRAM_DSF1_A   : out   mvl7w;
        VRAM_WBWE_N_A : out   mvl7w;
        VRAM_CAS_A_0  : out   mvl7w;
        VRAM_CAS_A_1  : out   mvl7w;
        RB2_SEL_A     : out   mvl7w_vector (2 downto 0);
        RB2_DATA_A_0  : inout mvl7w_vector (7 downto 0);
        RB2_DATA_A_1  : inout mvl7w_vector (7 downto 0);
        VRAM_RAS_B    : out   mvl7w;
        VRAM_ADDR_B   : out   mvl7w_vector (8 downto 0);
        VRAM_DTOE_N_B : out   mvl7w;
        VRAM_DSF1_B   : out   mvl7w;
        VRAM_WBWE_N_B : out   mvl7w;
        VRAM_CAS_B_0  : out   mvl7w;
        VRAM_CAS_B_1  : out   mvl7w;
        RB2_SEL_B     : out   mvl7w_vector (2 downto 0);
        RB2_DATA_B_0  : inout mvl7w_vector (7 downto 0);
        RB2_DATA_B_1  : inout mvl7w_vector (7 downto 0);
        VRAM_RAS_C    : out   mvl7w;
        VRAM_ADDR_C   : out   mvl7w_vector (8 downto 0);
        VRAM_DTOE_N_C : out   mvl7w;
        VRAM_DSF1_C   : out   mvl7w;
        VRAM_WBWE_N_C : out   mvl7w;
        VRAM_CAS_C_0  : out   mvl7w;
        VRAM_CAS_C_1  : out   mvl7w;
        RB2_SEL_C     : out   mvl7w_vector (2 downto 0);
        RB2_DATA_C_0  : inout mvl7w_vector (7 downto 0);
        RB2_DATA_C_1  : inout mvl7w_vector (7 downto 0);
        VRAM_RAS_D    : out   mvl7w;
        VRAM_ADDR_D   : out   mvl7w_vector (8 downto 0);
        VRAM_DTOE_N_D : out   mvl7w;
        VRAM_DSF1_D   : out   mvl7w;
        VRAM_WBWE_N_D : out   mvl7w;
        VRAM_CAS_D_0  : out   mvl7w;
        VRAM_CAS_D_1  : out   mvl7w;
        RB2_SEL_D     : out   mvl7w_vector (2 downto 0);
        RB2_DATA_D_0  : inout mvl7w_vector (7 downto 0);
        RB2_DATA_D_1  : inout mvl7w_vector (7 downto 0);
        VC_TX_REQ     : in    mvl7w;
        VC_SET_TSC    : in    mvl7w;
        RO_Y_DISP     : out   mvl7w_vector (1 downto 0);

        -- Display Control Bus Interface (18 pins)
        DCB_ADDR      : out   mvl7w_vector (3 downto 0);
        DCB_DATA      : inout mvl7w_vector (7 downto 0);
        DCB_CRS       : out   mvl7w_vector (2 downto 0);
        DCB_CS_N      : out   mvl7w;
        DCB_RW_N      : out   mvl7w;
        DCB_ACK_N     : in    mvl7w;

        -- Miscellaneous Back End pins (2 pins)
        VERT_INT_N    : in    mvl7w;
        VIDEO_INT_N   : in    mvl7w;

        -- ASIC Mandatory pins (13 pins)
        TEI           : in    mvl7w;  -- External tri-state control
        JTAG_TDI      : in    mvl7w;
        JTAG_TMS      : in    mvl7w;
        JTAG_TCK      : in    mvl7w;
        JTAG_TDO      : out   mvl7w;
        TP            : in    mvl7w_vector (1 downto 0);
        PLL_RESET_N   : in    mvl7w;
        LP1           : out   mvl7w;
        LP2           : in    mvl7w;
        AGND          : out   mvl7w;
        AVSS          : in    mvl7w;
        AVDD          : in    mvl7w
    );
end REX3;


2.4 Package Pin Assignment


The following list of package pin assignments is from the LSI Logic LBOND program. The REX3 is mounted
in a 304 MQUAD cavity down package, and pins are numbered by LSI in a counter-clockwise manner when
viewing the die. When mounted (cavity down) on the PC board, pins are also numbered in a
counter-clockwise fashion. Therefore, the printed circuit board pin number is equal to
(305 - LSI pin number).
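
The renumbering rule above can be sketched as a small helper. The function name `pcb_pin` and the constant `PACKAGE_PINS` are illustrative and do not come from the LBOND output.

```python
# Convert an LSI (die view) pin number to the printed circuit board pin number,
# using the relation given in the text: PCB pin = 305 - LSI pin.

PACKAGE_PINS = 304  # 304 MQUAD package


def pcb_pin(lsi_pin):
    """Return the PC board pin number for a given LSI pin number."""
    if not 1 <= lsi_pin <= PACKAGE_PINS:
        raise ValueError(f"LSI pin number out of range: {lsi_pin}")
    return PACKAGE_PINS + 1 - lsi_pin


if __name__ == "__main__":
    for lsi in (1, 20, 304):
        print(f"LSI pin {lsi} -> PCB pin {pcb_pin(lsi)}")
```

The mapping is its own inverse, so the same helper converts a board pin number back to the LSI numbering used in the list that follows.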


Pin Number Signal Name

1 vdd
2 p_ad_10
3 p_ad_11
4 p_ad_12
5 p_ad_13
6 p_ad_14
7 vss
8 p_ad_15
9 p_ad_16
10 p_ad_17
11 p_ad_18
12 p_ad_19
13 V
14 p_ad_20
15 p_ad_21
16 p_ad_22
17 p_ad_23
18 p_ad_24
19 vss
20 gio64clkx
21 pll_reset_n
22 lp1
23 lp2
24 agnd
25 avdd
26 avss
27 p_ad_25
28 p_ad_26
29 p_ad_27
30 p_ad_28
31 p_ad_29
32 vdd
33 p_ad_30
34 p_ad_31
35 p_as_n

36 p_read
37 p_memdly
38 p_grxdly
39 p_ad_32
40 p_ad_33
41 vdd
42 vss
43 vss
44 p_ad_34
45 p_ad_35
46 p_ad_36
47 p_ad_37
48 p_ad_38
49 vdd
50 p_ad_39
51 p_ad_40
52 p_ad_41
53 p_ad_42
54 p_ad_43
55 vss
56 p_ad_44
57 p_ad_45
58 p_ad_46
59 p_ad_47
60 p_ad_48
61 vdd
62 p_ad_49
63 p_ad_50
64 p_ad_51
65 p_ad_52
66 p_ad_53
67 vss
68 p_ad_54
69 p_ad_55
70 p_ad_56
71 p_ad_57
72 p_ad_58
73 vdd
74 p_ad_59
75 p_ad_60
76 vss
77 vdd
78 p_ad_61
79 p_ad_62
80 p_ad_63
81 slot_number_0
82 slot_number_1
83 tp_0
84 tp_1
85 video_int_n
86 vert_int_n
87 jtag_tms
88 jtag_tdi
89 jtag_tck
90 tei
91 jtag_tdo
92 ro_y_disp_0
93 ro_y_disp_1
94 vss
95 vc_tx_req
96 vc_set_tsc
97 rb2_data_a_0_0
98 rb2_data_a_0_1
99 rb2_data_a_0_2
100 rb2_data_a_0_3
101 rb2_data_a_0_4
102 vdd
103 rb2_data_a_0_5
104 rb2_data_a_0_6
105 vss
106 rb2_data_a_0_7
107 vram_wbwe_n_a
108 vram_dtoe_n_a
109 vram_dsf1_a
110 rb2_sel_a_0
111 rb2_sel_a_1
112 vdd
113 rb2_sel_a_2
114 vram_addr_a_0
115 vram_addr_a_1
116 vram_addr_a_2
117 vram_addr_a_3
118 vss
119 vdd
120 vram_addr_a_4
121 vram_addr_a_5
122 vram_addr_a_6
123 vram_addr_a_7
124 vram_addr_a_8
125 vss
126 vram_ras_a
127 vram_cas_a_0
128 vram_cas_a_1
129 rb2_data_a_1_0
130 rb2_data_a_1_1
131 vdd
132 rb2_data_a_1_2
133 rb2_data_a_1_3
134 rb2_data_a_1_4
135 rb2_data_a_1_5
136 rb2_data_a_1_6
137 vss
138 rb2_data_a_1_7
139 rb2_data_b_0_0
140 vdd
141 rb2_data_b_0_1
142 rb2_data_b_0_2
143 rb2_data_b_0_3
144 rb2_data_b_0_4
145 rb2_data_b_0_5
146 vss
147 rb2_data_b_0_6
148 rb2_data_b_0_7
149 vram_wbwe_n_b
150 vram_dtoe_n_b
151 vram_dsf1_b
152 vdd
153 vss
154 rb2_sel_b_0
155 rb2_sel_b_1
156 rb2_sel_b_2
157 vram_addr_b_0
158 vdd
159 vram_addr_b_1
160 vram_addr_b_2
161 vram_addr_b_3
162 vram_addr_b_4
163 vram_addr_b_5
164 vss
165 vram_addr_b_6
166 vram_addr_b_7
167 vram_addr_b_8
168 vram_ras_b
169 vram_cas_b_0
170 vdd
171 vram_cas_b_1
172 rb2_data_b_1_0
173 rb2_data_b_1_1
174 rb2_data_b_1_2
175 rb2_data_b_1_3
176 vss
177 rb2_data_b_1_4
178 rb2_data_b_1_5
179 rb2_data_b_1_6
180 rb2_data_b_1_7
181 rb2_data_c_0_0
182 vdd
183 rb2_data_c_0_1
184 rb2_data_c_0_2
185 rb2_data_c_0_3
186 rb2_data_c_0_4
187 rb2_data_c_0_5
188 vss
189 rb2_data_c_0_6
190 rb2_data_c_0_7
191 vram_wbwe_n_c
192 vram_dtoe_n_c
193 vram_dsf1_c
194 vdd
195 vss
196 rb2_sel_c_0
197 rb2_sel_c_1
198 rb2_sel_c_2

199 vram_addr_c_0
200 vram_addr_c_1
201 vdd
202 vram_addr_c_2
203 vram_addr_c_3
204 vram_addr_c_4
205 vram_addr_c_5
206 vram_addr_c_6
207 vss
208 vram_addr_c_7
209 vram_addr_c_8
210 vram_ras_c
211 vram_cas_c_0
212 vram_cas_c_1
213 vdd
214 rb2_data_c_1_0
215 rb2_data_c_1_1
216 rb2_data_c_1_2
217 rb2_data_c_1_3
218 rb2_data_c_1_4
219 vss
220 rb2_data_c_1_5
221 rb2_data_c_1_6
222 rb2_data_c_1_7
223 rb2_data_d_0_0
224 rb2_data_d_0_1
225 vdd
226 rb2_data_d_0_2
227 rb2_data_d_0_3
228 vss
229 vdd
230 rb2_data_d_0_4
231 rb2_data_d_0_5
232 rb2_data_d_0_6
233 rb2_data_d_0_7
234 vram_wbwe_n_d
235 vss
236 vram_dtoe_n_d
237 vram_dsf1_d
238 rb2_sel_d_0
239 rb2_sel_d_1
240 rb2_sel_d_2
241 vdd
242 vram_addr_d_0
243 vram_addr_d_1
244 vram_addr_d_2
245 vram_addr_d_3
246 vss
247 vram_addr_d_4
248 vram_addr_d_5
249 vram_addr_d_6
250 vram_addr_d_7
251 vram_addr_d_8
252 vdd
253 vram_ras_d
254 vram_cas_d_0
255 vram_cas_d_1
256 rb2_data_d_1_0
257 rb2_data_d_1_1
258 rb2_data_d_1_2
259 vss
260 rb2_data_d_1_3
261 rb2_data_d_1_4
262 rb2_data_d_1_5
263 rb2_data_d_1_6
264 rb2_data_d_1_7
265 vdd
266 vss
267 dcb_data_0
268 dcb_data_1
269 dcb_data_2
270 dcb_data_3
271 dcb_data_4
272 vdd
273 dcb_data_5
274 dcb_data_6
275 dcb_data_7
276 dcb_crs_0
277 dcb_crs_1
278 vss
279 dcb_crs_2
280 dcb_rw_n
281 vdd
282 dcb_cs_n
283 dcb_addr_0
284 dcb_addr_1
285 dcb_addr_2
286 dcb_addr_3
287 vss
288 dcb_ack_n
289 vv_int_n
290 fifo_int_n
291 vss
292 gioreset_n
293 p_ad_0
294 p_ad_1
295 p_ad_2
296 p_ad_3
297 p_ad_4
298 vdd
299 p_ad_5
300 p_ad_6
301 p_ad_7
302 p_ad_8
303 p_ad_9
304 vss


3 Programmer Interface


3.1 Registers 


Table 7 lists the host accessible registers in REX3. 


Addresses shown are an offset from the base GIO address of 0x1FnF0000, where n=(0,4,8,C), depending
upon the strapping of the GIO64 SLOT_NUMBER(1:0) pins. Address offsets beginning with 0x1nnn are
intended to map corresponding registers into a separate “protected” page.


Access to address + 0x0800 issues primitive GO command. 
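
The addressing rules above can be illustrated with a short sketch. It assumes that SLOT_NUMBER(1:0) supplies address bits [23:22], as stated in Table 2, so that slot values 0 through 3 give n = 0, 4, 8 and C. The helper names `rex3_base` and `go_address` are hypothetical, and the XSTART offset 0x0100 is taken from Table 7.

```python
# Sketch of REX3 register address arithmetic on the host side.
# Assumption: SLOT_NUMBER(1:0) drives address bits [23:22] (Table 2).

BASE_TEMPLATE = 0x1F0F0000  # 0x1FnF0000 with n = 0
GO_OFFSET = 0x0800          # adding this to a register address issues GO


def rex3_base(slot_number):
    """Base GIO address for a board strapped to the given slot number."""
    if not 0 <= slot_number <= 3:
        raise ValueError("SLOT_NUMBER is a 2-bit value")
    return BASE_TEMPLATE | (slot_number << 22)


def go_address(slot_number, reg_offset):
    """Address of a register alias that also issues the primitive GO command."""
    return rex3_base(slot_number) + reg_offset + GO_OFFSET


if __name__ == "__main__":
    for slot in range(4):
        print(f"slot {slot}: base = 0x{rex3_base(slot):08X}")
    # Writing XSTART (offset 0x0100) through its GO alias:
    print(f"XSTART + GO = 0x{go_address(0, 0x0100):08X}")
```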


Type "^ " registers are not passed through either BFIFO or GFIFO, and force an immediate action when writ- 
ten to. 


Type “ " registers are associated with the Display Control Bus and go through BFIFO. 


Registers other than type "^" and“ " are associated with the graphics context and go through GFIFO. 
Writes to type “е” registers will stall at the output of GFIFO until the graphics pipeline is idle. 

Type "2c" indicates twos-complement value. 

Type "sm" indicates signed magnitude value. 


Write/Read format bit grouping is shown with the location of the binary point (for COLOR registers, the
24-bit mode binary point is shown). "s" refers to the sign bit and "o" refers to the overflow bit.
Parentheses are used to indicate a place holder for unused bits.


Write format ^" denotes write-only command address. 


Unused bits return 0 when read. 
Address | Name        | Type | Write     | Read      | Description





        | DRAWMODE1   |      |           |           | Draw mode bits.
        | DRAWMODE0   |      |           |           | Draw instruction and mode bits.
        | LSMODE      |      |           |           | Line stipple mode register.
        | LSPATTERN   |      |           |           | Line stipple pattern, (msb = first pixel).
        | LSPATSAVE   |      |           |           | Copy of LSPATTERN for pattern restore (LSRESTORE).
        | ZPATTERN    |      |           |           | Pattern register, (msb = first pixel).
        | COLORBACK   |      |           |           | ABGR/CI opaque patterning color or blendfunction destination color.
        | COLORVRAM   |      |           |           | VRAM FASTCLEAR color, (set DRAWDEPTH and RGBMODE first).
        | ALPHAREF    |      |           |           | AFUNCTION reference alpha value.
        | STALL0      |      |           |           | Forces stall at the output of GFIFO until graphics pipeline is idle.
        | SMASK0X     |      |           |           | Screenmask 0: min, max boundaries, (window relative GL smask).
        | SMASK0Y     |      |           |           | Screenmask 0: min, max boundaries, (window relative GL smask).
        | SETUP       |      |           |           | Performs line/span setup without iteration (ignore DOSETUP).
        | STEPZ       |      |           |           | Enables ZPATTERN (Z test fail) for one iteration, (current pixel).
        | LSRESTORE   |      |           |           | Updates LSPATTERN/LSRCOUNT with LSPATSAVE/LSRCNTSAVE.
        | LSSAVE      |      |           |           | Updates LSPATSAVE/LSRCNTSAVE with LSPATTERN/LSRCOUNT.
0x0100  | XSTART      | 2c   | 16.4(7)   | 16.4(7)   | Iterator X start-point (current), full state for context switch.
0x0104  | YSTART      | 2c   | 16.4(7)   | 16.4(7)   | Iterator Y start-point (current), full state for context switch.
0x0108  | XEND        | 2c   | 16.4(7)   | 16.4(7)   | Iterator X endpoint, full state for context switch.
0x010C  | YEND        | 2c   | 16.4(7)   | 16.4(7)   | Iterator Y endpoint, full state for context switch.
0x0110  | XSAVE       | 2c   | 16        | 16        | Copy of XSTART integer value for BLOCK addressing mode.
0x0114  | XYMOVE      | 2c e | 16,16     | 16,16     | X,Y offset from XSTART,YSTART for relative operations (Scr2Scr).
0x0118  | BRESD       | 2c   | 19.8      | 19.8      | Bresenham "d" error term, for context switch.
0x011C  | BRESS1      | 2c   | 2.15      | 2.15      | Antialiased Bresenham "s1" coverage term, for context switch.
0x0120  | BRESOCTINC1 |      | 3(4),17.3 | 3(4),17.3 | Bresenham octant & "incr1" error term increment value, for context switch.
0x0124  | BRESRNDINC2 | 2c   | 8(3),18.3 | 8(3),18.3 | Bresenham 8-bit octant rounding mode (msb == octant 1, lsb == octant 8)
        |             |      |           |           | & Bresenham "incr2" error term increment value, for context switch.
0x0128  | BRESE1      |      | 1.15      | 1.15      | Bresenham "e1" constant (minor slope) for antialiased line draw.
0x012C  | BRESS2      | 2c   | 18.8      | 18.8      | Antialiased Bresenham "s2" coverage term, for context switch.
0x0130  | AWEIGHT0    |      | 8x4       | 8x4       | First half of 16x4-bit antialiased RGB/CI line weight table.
0x0134  | AWEIGHT1    |      | 8x4       | 8x4       | Second half of 16x4-bit antialiased RGB/CI line weight table.
0x0138  | XSTARTF     |      | 12.4(7)   |           | GL version of XSTART, (zeros 4 msbs).
0x013C  | YSTARTF     |      | 12.4(7)   |           | GL version of YSTART, (zeros 4 msbs).
0x0140  | XENDF       |      | 12.4(7)   |           | GL version of XEND, (zeros 4 msbs).
0x0144  | YENDF       |      | 12.4(7)   |           | GL version of YEND, (zeros 4 msbs).
0x0148  | XSTARTI     | 2c   | 16        |           | Integer format for XSTART.
0x014C  | XENDF1      |      | 12.4(7)   |           | Same as XENDF.
0x0150  | XYSTARTI    | 2c   | 16,16     |           | Packed integer format for XSTART & YSTART.
0x0154  | XYENDI      | 2c   | 16,16     |           | Packed integer format for XEND & YEND.
0x0158  | XSTARTENDI  | 2c   | 16,16     |           | Packed integer format for XSTART & XEND.

Table 7: REX3 host visible registers.
Address | Name        | Type | Write     | Read      | Description
0x0200  | COLORRED    |      | o12.11    | o12.11    | Red/CI shade full state (CI modes = o8.11, o4.11; RGB red = o8.15, etc.).
        |             |      | o12.9     |           | 12-bit CI mode shade. (Must first init DRAWMODE1 RGBMODE and
        |             |      |           |           | DRAWDEPTH fields to set this register write mode; not for context restore.)
0x0204  | COLORALPHA  |      | o8.11     | o8.11     | Full state of alpha shade.
0x0208  | COLORGRN    |      | o8.11     | o8.11     | Full state of green shade.
0x020C  | COLORBLUE   |      | o8.11     | o8.11     | Full state of blue shade.
0x0210  | SLOPERED    | sm   | s(7)12.11 |           | Red/CI DDA slope: "s" = 1 on write denotes sm to 2c conversion, in which
        |             | 2c   |           | 13.11     | case 12.11 result is computed; always "s" is placed into msb of 13.1 field.
0x0214  | SLOPEALPHA  | sm   | s(11)8.11 |           | Alpha DDA slope: "s" = 1 on write denotes sm to 2c conversion, in which
        |             | 2c   |           | 9.11      | case 8.11 result is computed; always "s" is placed into msb of 9.1 field.
0x0218  | SLOPEGRN    | sm   | s(11)8.11 |           | Green DDA slope: "s" = 1 on write denotes sm to 2c conversion, in which
        |             | 2c   |           | 9.11      | case 8.11 result is computed; always "s" is placed into msb of 9.1 field.
0x021C  | SLOPEBLUE   | sm   | s(11)8.11 |           | Blue DDA slope: "s" = 1 on write denotes sm to 2c conversion, in which
        |             | 2c   |           | 9.11      | case 8.11 result is computed; always "s" is placed into msb of 9.1 field.
0x0220  | WRMASK      | °    | 24        | 24        | Write mask for pixel, OLAY, or PUP/CID planes, (lsbs for 8-bit system).
0x0224  | COLORI      |      | 24        |           | Packed BGR or CI color registers -- zeros fractions. (Must program
        |             |      |           |           | DRAWMODE1 RGBMODE bit first to set color register write mode.)
0x0228  | COLORX      |      | 12.11     |           | Color index shade, zeros overflow bit.
0x022C  | SLOPERED1   | sm   | s(7)13.11 |           | Same as SLOPERED.
0x0230  | HOSTRW0     |      | 32        | 32        | Host PIO/DMA data port, most significant word.
0x0234  | HOSTRW1     |      | 32        | 32        | Host PIO/DMA data port, least significant word.
0x0238  | DCBMODE     |      | 29        | 29        | Display control bus mode register.
0x0240  | DCBDATA0    |      | 32        | 32        | Display control bus data port, most significant word.
0x0244  | DCBDATA1    |      | 32        | 32        | Display control bus data port, least significant word.
        | SMASK1X     |      |           |           | Screenmask 1: min, max boundary (screen absolute: X11 directionality).
        | SMASK1Y     |      |           |           | Screenmask 1: min, max boundary.
        | SMASK2X     |      |           |           | Screenmask 2: min, max boundary (screen absolute: X11 directionality).
        | SMASK2Y     |      |           |           | Screenmask 2: min, max boundary.
        | SMASK3X     |      |           |           | Screenmask 3: min, max boundary (screen absolute: X11 directionality).
        | SMASK3Y     |      |           |           | Screenmask 3: min, max boundary.
        | SMASK4X     |      |           |           | Screenmask 4: min, max boundary (screen absolute: X11 directionality).
        | SMASK4Y     |      |           |           | Screenmask 4: min, max boundary.
        | TOPSCAN     |      |           |           | Y address for top of screen scan line, (0,1023=top,bottom of framebuffer).
        | XYWIN       |      |           |           | Screen X,Y offset for window relative addressing and coordinate biasing.
        | CLIPMODE    |      |           |           | CID, screenmask mode and enable bits.
        | STALL1      |      |           |           | Forces stall at the output of GFIFO until graphics pipeline is idle.
        | CONFIG      |      |           |           | Miscellaneous configuration bits.
        | STATUS      |      |           |           | Chip busy and FIFO status register. Reading clears interrupt status bits.
        | USER_STATUS |      |           |           | Chip busy and FIFO status register for User code. Non-destructive reads.
        | DCBRESET    |      |           |           | Resets the DCB bus state machine and flushes BFIFO.

Table 7: REX3 host visible registers.
3.1.1 Control Register Bit Definitions


The following tables outline the definition of REX3 control register bits. Refer to related sections in Chapter 
3 for discussion of the REX3 drawing, masking, and pixel I/O programming interface. 


3.1.1.1 DRAWMODE0 Register


Name          | Description
OPCODE(1:0)   | Primitive function command.
ADRMODE(2:0)  | Primitive function addressing mode.
DOSETUP       | Enables SPAN/BLOCK/I_LINE/F_LINE/A_LINE iterator setup.
COLORHOST     | RGB/CI draw source: 0=DDAs; 1=HOSTRW register.
ALPHAHOST     | Alpha draw source: 0=DDA; 1=HOSTRW register.
STOPONX       | Specifies execution tests for X coordinate endpoint reached.
STOPONY       | Specifies execution tests for Y coordinate endpoint reached.
SKIPFIRST     | Disable start-point draw (lines only).
SKIPLAST      | Disable endpoint draw, freeze iterators at endpoint (lines only).
ENZPATTERN    | Patterning enable.
ENLSPATTERN   | Line stipple pattern enable.
LSADVLAST     | Enables stipple advance at end of line.
LENGTH32      | Limits draw primitive to 32 pixels.
ZPOPAQUE      | Enables opaque (vs. transparent) stipple mode for ZPATTERN.
LSOPAQUE      | Enables opaque (vs. transparent) stipple mode for LSPATTERN.
SHADE         | Enables linear shader R,G,B,A/CI DDAs.
LRONLY        | Aborts primitive if initial XSTARTI > XENDI.
XYOFFSET      | Add XYMOVE to XSTART, YSTART for draw relative operations.
CICLAMP       | Enables CI shader DDA over/underflow clamping for CI pixels.
ENDPTFILTER   | Enables hardware endpoint filtering (A_LINE only).
YSTRIDE       | Enables Y axis increment/decrement by 2.

Table 8: DRAWMODE0 register














Value | Name    | Description
00    | NOOP    | Do nothing.
01    | READ    | Host read from framebuffer using ADRMODE.
10    | DRAW    | Draw into framebuffer using ADRMODE.
11    | SCR2SCR | Framebuffer to framebuffer copy, (valid with ADRMODE=SPAN/BLOCK).

Table 9: DRAWMODE0 OPCODE(1:0) definition.


Value | Name    | Description
000   | SPAN    | Span (or point) addressing mode.
001   | BLOCK   | Block addressing mode, advance Y and restore XSTART at end of span.
010   | I_LINE  | Bresenham line addressing mode, integer endpoints.
011   | F_LINE  | Bresenham line addressing mode, fractional endpoints.
100   | A_LINE  | Antialiased Bresenham line addressing mode.

Table 10: DRAWMODE0 ADRMODE(2:0) definition.
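
The two encodings above can be captured as enumerations. The following is a minimal sketch; the class names and the helper `is_valid_combination` are illustrative, and the only restriction checked is the one stated in Table 9 (SCR2SCR with SPAN or BLOCK). Other restrictions may apply that are not listed in these tables.

```python
from enum import IntEnum


class Opcode(IntEnum):
    """DRAWMODE0 OPCODE(1:0) values from Table 9."""
    NOOP = 0b00
    READ = 0b01
    DRAW = 0b10
    SCR2SCR = 0b11


class AdrMode(IntEnum):
    """DRAWMODE0 ADRMODE(2:0) values from Table 10."""
    SPAN = 0b000
    BLOCK = 0b001
    I_LINE = 0b010
    F_LINE = 0b011
    A_LINE = 0b100


def is_valid_combination(opcode, adrmode):
    """Check the one pairing rule stated in Table 9."""
    if opcode == Opcode.SCR2SCR:
        return adrmode in (AdrMode.SPAN, AdrMode.BLOCK)
    return True


if __name__ == "__main__":
    print(is_valid_combination(Opcode.SCR2SCR, AdrMode.BLOCK))
    print(is_valid_combination(Opcode.SCR2SCR, AdrMode.A_LINE))
```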








August 13, 1993 


page24 


SILICON GRAPHICS PROPRIETARY and CONFIDENTIAL 





























3.1.1.2 DRAWMODE1 Register

Bits | Name | Access | Init | Active | Description
2:0 | PLANES(2:0) | R/W | 0x1 | | Specifies which framebuffer planes are enabled for R/W access:
      000 none
      001 R/W RGB/CI planes
      010 R/W RGBA planes
      100 R/W OLAY planes
      101 R/W PUP planes
      110 R/W CID planes
4:3 | DRAWDEPTH(1:0) | R/W | 0x0 | | Drawn depth of framebuffer PLANES, not including alpha:
      00 Depth = 4 bits
      01 Depth = 8 bits
      10 Depth = 12 bits
      11 Depth = 24 bits
5 | DBLSRC | R/W | 0x0 | | Double-buffer mode pixel read source buffer (0 = buffer 0).
6 | YFLIP | R/W | 0x0 | H | Enable GL Y coord mapping: 0 = origin at upper left; 1 = origin at lower left.
7 | RWPACKED | R/W | 0x0 | H | Enables pixel packing for HOSTRW access.
9:8 | HOSTDEPTH(1:0) | R/W | 0x0 | | HOSTRW pixel packing/unpacking:
      00 Pixel depth = 4 bits (1-2-1 BGR or 4 CI)
      01 Pixel depth = 8 bits (3-3-2 BGR or 8 CI)
      10 Pixel depth = 12 bits (4-4-4 BGR or 12 CI)
      11 Pixel depth = 32 bits (8-8-8-8 ABGR)
10 | RWDOUBLE | R/W | 0x0 | H | Enables double word (64-bit) host transfers (vs. 32-bit single word).
      HOSTRW(0,1) format for host framebuffer DMA/PIO only.
11 | SWAPENDIAN | R/W | 0x0 | H | OpenGL SWAP ENDIAN pixel storage attribute. When true, HOSTRW
      short and long packed pixel data have their byte ordering swapped.
14:12 | COMPARE(2:0) | R/W | 0x7 | | Color compare and AFUNCTION condition specifier (conditions OR'ed):
      COMPARE(2) R/W H Enable compare condition: src > dest.
      COMPARE(1) R/W H Enable compare condition: src = dest.
      COMPARE(0) R/W H Enable compare condition: src < dest.
15 | RGBMODE | R/W | 0x1 | H | Selects RGB (vs. CI) shade, round, dither, compare, and clamp modes.
16 | DITHER | R/W | 0x0 | H | Enables dithering.
17 | FASTCLEAR | R/W | 0x1 | H | Enables fast-clear write mode when CID checking is disabled (CLIPMODE
      CIDMATCH = 0xF). Valid with DRAW SPAN/BLOCK only.
18 | BLEND | R/W | 0x0 | H | Enable blendfunction.
21:19 | SFACTOR(2:0) | R/W | 0x0 | H | Blendfunction source blending factor (see Table 13).
24:22 | DFACTOR(2:0) | R/W | 0x0 | H | Blendfunction destination blending factor (see Table 14).
25 | BACKBLEND | R/W | 0x0 | H | Enable COLORBACK to be used for blendfunction destination color.
26 | PREFETCH | R/W | 0x0 | H | Enables host framebuffer pixel prefetch mechanism for PIO reads.
27 | BLENDALPHA | R/W | 0x0 | H | Selects SFACTOR BF_SA source alpha: '1' = source alpha, '0' = 1.0.
31:28 | LOGICOP(3:0) | R/W | 0x3 | | Logical operation type (see Table 12).

Table 11: DRAWMODE1 register.
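
The bit positions in Table 11 can be exercised with a small host-side helper. The following is only a sketch: the names DRAWMODE1_FIELDS, pack_drawmode1 and unpack_drawmode1 are hypothetical and not part of any REX3 software, and the reset word is simply what the Init column implies when all unlisted fields are zero.

```python
# Field layout transcribed from Table 11: name -> (msb, lsb).
DRAWMODE1_FIELDS = {
    "PLANES": (2, 0), "DRAWDEPTH": (4, 3), "DBLSRC": (5, 5), "YFLIP": (6, 6),
    "RWPACKED": (7, 7), "HOSTDEPTH": (9, 8), "RWDOUBLE": (10, 10),
    "SWAPENDIAN": (11, 11), "COMPARE": (14, 12), "RGBMODE": (15, 15),
    "DITHER": (16, 16), "FASTCLEAR": (17, 17), "BLEND": (18, 18),
    "SFACTOR": (21, 19), "DFACTOR": (24, 22), "BACKBLEND": (25, 25),
    "PREFETCH": (26, 26), "BLENDALPHA": (27, 27), "LOGICOP": (31, 28),
}


def pack_drawmode1(**fields):
    """Build a 32-bit DRAWMODE1 word from named field values (hypothetical helper)."""
    word = 0
    for name, value in fields.items():
        msb, lsb = DRAWMODE1_FIELDS[name]
        width = msb - lsb + 1
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word |= value << lsb
    return word


def unpack_drawmode1(word):
    """Split a 32-bit DRAWMODE1 word back into its named fields."""
    return {
        name: (word >> lsb) & ((1 << (msb - lsb + 1)) - 1)
        for name, (msb, lsb) in DRAWMODE1_FIELDS.items()
    }


# Word implied by the Init column of Table 11 (fields not listed are zero).
DRAWMODE1_INIT = pack_drawmode1(
    PLANES=0x1, COMPARE=0x7, RGBMODE=1, FASTCLEAR=1, LOGICOP=0x3
)
```

Keeping the layout in one dictionary means packing and unpacking cannot drift apart.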
Value | Symbol | Operation
0000 | LO_ZERO | 0
0001 | LO_AND | src AND dst
0010 | LO_ANDR | src AND (NOT dst)
0011 | LO_SRC | src
0100 | LO_ANDI | (NOT src) AND dst
0101 | LO_DST | dst
0110 | LO_XOR | src XOR dst
0111 | LO_OR | src OR dst
1000 | LO_NOR | NOT (src OR dst)
1001 | LO_XNOR | NOT (src XOR dst)
1010 | LO_NDST | NOT dst
1011 | LO_ORR | src OR (NOT dst)
1100 | LO_NSRC | NOT src
1101 | LO_ORI | (NOT src) OR dst
1110 | LO_NAND | NOT (src AND dst)
1111 | LO_ONE | 1

Table 12: DRAWMODE1 LOGICOP(3:0) definition.
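
The sixteen operations of Table 12 can be modelled in software as follows. This is a sketch for checking expected pixel values on the host, not a description of how the hardware implements them; the function name apply_logicop and the 24-bit default width are assumptions.

```python
MASK24 = 0xFFFFFF  # assumed pixel width for this sketch

# LOGICOP code -> operation, transcribed from Table 12.
LOGICOPS = {
    0x0: lambda s, d: 0,           # LO_ZERO
    0x1: lambda s, d: s & d,       # LO_AND
    0x2: lambda s, d: s & ~d,      # LO_ANDR
    0x3: lambda s, d: s,           # LO_SRC
    0x4: lambda s, d: ~s & d,      # LO_ANDI
    0x5: lambda s, d: d,           # LO_DST
    0x6: lambda s, d: s ^ d,       # LO_XOR
    0x7: lambda s, d: s | d,       # LO_OR
    0x8: lambda s, d: ~(s | d),    # LO_NOR
    0x9: lambda s, d: ~(s ^ d),    # LO_XNOR
    0xA: lambda s, d: ~d,          # LO_NDST
    0xB: lambda s, d: s | ~d,      # LO_ORR
    0xC: lambda s, d: ~s,          # LO_NSRC
    0xD: lambda s, d: ~s | d,      # LO_ORI
    0xE: lambda s, d: ~(s & d),    # LO_NAND
    0xF: lambda s, d: -1,          # LO_ONE (all ones before masking)
}


def apply_logicop(op, src, dst, mask=MASK24):
    """Combine src and dst with LOGICOP code op, truncated to the pixel width."""
    return LOGICOPS[op](src, dst) & mask
```

One property worth noting: read as a truth table, bit 0 of the code is the result for src=1,dst=1, bit 1 for src=1,dst=0, bit 2 for src=0,dst=1, and bit 3 for src=0,dst=0.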





Value | Symbol | Source Blending Factor
000 | BF_ZERO | 0
001 | BF_ONE | 1
010 | BF_DC | normalized[destination color (or COLORBACK)]
011 | BF_MDC | 1 - normalized[destination color (or COLORBACK)]
100 | BF_SA | normalized[source alpha]
101 | BF_MSA | 1 - normalized[source alpha]

Table 13: DRAWMODE1 SFACTOR(2:0) definition.














Value | Symbol | Destination Blending Factor
000 | BF_ZERO | 0
001 | BF_ONE | 1
010 | BF_SC | normalized[source color]
011 | BF_MSC | 1 - normalized[source color]
100 | BF_SA | normalized[source alpha]
101 | BF_MSA | 1 - normalized[source alpha]

Table 14: DRAWMODE1 DFACTOR(2:0) definition.
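
Tables 13 and 14 only define the factors. The sketch below assumes the usual GL blend equation, result = src*SFACTOR + dst*DFACTOR clamped to 1.0, applied per normalized component; that equation, the clamp, and the names blend and _factor are assumptions, and BACKBLEND/BLENDALPHA handling is left out.

```python
# Factor codes shared by Tables 13 and 14. Codes 2 and 3 mean the *other*
# operand's color: destination color for SFACTOR, source color for DFACTOR.
BF_ZERO, BF_ONE, BF_COLOR, BF_MCOLOR, BF_SA, BF_MSA = range(6)


def _factor(code, other_color, src_alpha):
    """Return the normalized blending factor selected by a 3-bit code."""
    if code == BF_ZERO:
        return 0.0
    if code == BF_ONE:
        return 1.0
    if code == BF_COLOR:
        return other_color
    if code == BF_MCOLOR:
        return 1.0 - other_color
    if code == BF_SA:
        return src_alpha
    if code == BF_MSA:
        return 1.0 - src_alpha
    raise ValueError("factor codes 110 and 111 are not defined in the tables")


def blend(src, dst, src_alpha, sfactor, dfactor):
    """Blend one normalized component (assumed equation, clamped to 1.0)."""
    sf = _factor(sfactor, dst, src_alpha)
    df = _factor(dfactor, src, src_alpha)
    return min(1.0, src * sf + dst * df)
```

With SFACTOR=BF_SA and DFACTOR=BF_MSA this reduces to the familiar alpha interpolation between source and destination.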
3.1.1.3 LSMODE Register

Name | Description
LSRCOUNT(7:0) | Current value of LSREPEAT down counter (advance LS pattern when 0).
LSREPEAT(7:0) | Line stipple pattern (bit expansion) repeat factor (1 <= LSREPEAT <= 255).
LSRCNTSAVE(7:0) | Copy of LSRCOUNT (updated with write to LSSAVE register address).
LSLENGTH(3:0) | Length of LSPATTERN, from 17 to 32, starting with msb (0000=17).

Table 15: LSMODE register.
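
The interaction of LSLENGTH and LSREPEAT can be illustrated with a host-side model. This is a sketch under stated assumptions: the pattern is taken to wrap back to the msb after LSLENGTH+17 bits, the repeat counter is modelled as a simple reload-on-zero down counter, and the function names are hypothetical.

```python
def ls_pattern_length(lslength):
    """Decode LSLENGTH(3:0): 0000 means 17 bits, 1111 means 32 bits."""
    return 17 + (lslength & 0xF)


def stipple_bits(lspattern, lslength, lsrepeat, count):
    """Expand a 32-bit line stipple pattern into `count` per-pixel bits.

    Each pattern bit, starting with the msb, is repeated `lsrepeat` times
    before the pattern advances. Wrap-around at the pattern length is an
    assumption of this sketch.
    """
    if not 1 <= lsrepeat <= 255:
        raise ValueError("LSREPEAT must be in 1..255")
    length = ls_pattern_length(lslength)
    bits = []
    index = 0             # position within the pattern, 0 = msb
    remaining = lsrepeat  # plays the role of LSRCOUNT
    for _ in range(count):
        bits.append((lspattern >> (31 - index)) & 1)
        remaining -= 1
        if remaining == 0:            # advance LS pattern when counter hits 0
            remaining = lsrepeat
            index = (index + 1) % length
    return bits
```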





























3.1.1.4 CLIPMODE Register

Bits | Name | Access | Init | Active | Description
4:0 | ENSMASK(4:0) | R/W | 0x0 | H | Individual enables for SMASK4:0.
8:5 | <reserved> | R/W | 0x0 | |
12:9 | CIDMATCH(3:0) | R/W | 0x0 | | CID codes to compare, results OR'ed:
      CIDMATCH(3) H selects CID code 11 for CID check
      CIDMATCH(2) H selects CID code 10 for CID check
      CIDMATCH(1) H selects CID code 01 for CID check
      CIDMATCH(0) H selects CID code 00 for CID check

Table 16: CLIPMODE register.
3.1.1.5 STATUS Register/USER_STATUS Register

Name | Description
VERSION(2:0) | Revision code (001 = 1st revision).
GFXBUSY | Indicates graphics pipeline not idle or GFIFO not empty.
BACKBUSY | Indicates backend pipeline not idle or BFIFO not empty.
VRINT | Video controller vertical retrace interrupt. VR_INT_N falling-edge
      detected, generating VV_INT interrupt. Cleared by the read of STATUS,
      not cleared by the read of USER_STATUS.
VIDEOINT | Video option interrupt VIDEO_INT_N status, generating VV_INT
      interrupt.
GFIFOLEVEL(5:0) | Current GIO graphics FIFO level (0 = empty FIFO).
BFIFOLEVEL(4:0) | Current display bus FIFO level (0 = empty FIFO).
BFIFO_INT | BFIFOLEVEL above BFIFODEPTH interrupt was generated. Cleared by
      the read of STATUS, not cleared by the read of USER_STATUS.
      Provides sticky status of BFIFO above FIFO_INT_N interrupts.
GFIFO_INT | GFIFOLEVEL above GFIFODEPTH interrupt was generated. Cleared by
      the read of STATUS, not cleared by the read of USER_STATUS.
      Provides sticky status of GFIFO above FIFO_INT_N interrupts.

Table 17: STATUS register.
3.1.1.6 CONFIG Register

Name | Description
GIO32MODE | When set, the REX3 will assume that the information sent by the host
      during the byte count cycle of a GIO bus transfer is in GIO32 bus format.
      When cleared, GIO64 byte count cycles are assumed. When GIO32
      mode is selected, EXTREGXCVR should also be set, and BUSWIDTH
      should be cleared.
BUSWIDTH | Denotes the physical width of the GIO64 bus. 1=64 bits, 0=32 bits.
EXTREGXCVR | Denotes the presence of external registered transceivers separating the
      pipelined from the non-pipelined GIO64 bus.
BFIFODEPTH(3:0) | Display bus FIFO high/low trigger depth: stalls GIO bus and enables GIO
      timeout counter when BFIFOLEVEL > BFIFODEPTH and BFIFOABOVEINT
      is set. Host FIFO interrupt is generated when BFIFOLEVEL becomes
      less than BFIFODEPTH and BFIFOABOVEINT is cleared.
BFIFOABOVEINT | Display bus FIFO interrupt select. When set, GIO bus stalls and GIO
      timeout counter is enabled when BFIFOLEVEL > BFIFODEPTH. When
      cleared and BFIFOLEVEL becomes less than BFIFODEPTH, a host FIFO
      interrupt is generated.
GFIFODEPTH(4:0) | GIO graphics FIFO high/low trigger depth: stalls GIO bus and enables
      GIO timeout counter when GFIFOLEVEL > GFIFODEPTH and
      GFIFOABOVEINT is set. Host FIFO interrupt is generated when
      GFIFOLEVEL becomes less than GFIFODEPTH and GFIFOABOVEINT is
      cleared.
GFIFOABOVEINT | GFIFO interrupt select. When set, GIO bus stalls and GIO timeout
      counter is enabled when GFIFOLEVEL > GFIFODEPTH. When cleared
      and GFIFOLEVEL becomes less than GFIFODEPTH, a host FIFO interrupt
      is generated.
TIMEOUT(2:0) | GIO bus timeout interval: 000=0.96msec, 001=1.44msec... 111=4.32msec.
      Timeout generates host FIFOFULL interrupt and unstalls GIO bus.
VREFRESH(2:0) | Number of VRAM refresh cycles per transfer cycle, 000=refresh disabled.
FB_TYPE | Framebuffer fastclear column mask mode select:
      0 TI mode: replicate 4-bit column mask
      1 non-TI mode: zero-fill column mask 4 msbs

Table 18: CONFIG register.
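
The three TIMEOUT values quoted in Table 18 (0.96, 1.44 and 4.32 msec) are consistent with a linear step of 0.48 msec per code. The helper below assumes that linear progression for the unlisted codes; the function name is hypothetical and the result is kept in integer microseconds to avoid rounding.

```python
def gio_timeout_us(timeout_code):
    """Return the GIO bus timeout interval in microseconds for TIMEOUT(2:0).

    Assumes the interval grows linearly: 960 us at code 000 plus 480 us per
    step, which matches the three values listed in Table 18.
    """
    if not 0 <= timeout_code <= 7:
        raise ValueError("TIMEOUT is a 3-bit field")
    return 960 + 480 * timeout_code
```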
3.1.1.7 DCBMODE Register

Name | Description
DATAWIDTH(1:0) | Width of the data being transferred for each DCBDATA0 or DCBDATA1
      word. Needed to support the OpenGL SWAP ENDIAN construct, and to
      allow RGB triplets to be packed into words.
      00 4 bytes
      01 1 byte
      10 2 bytes
      11 3 bytes
ENDATAPACK | Determines the use of the DATAWIDTH field for packed/unpacked data.
      When set, all bytes addressed by DCBDATA will be transferred. When
      clear, only DATAWIDTH bytes in each addressed DCBDATA word will be
      transferred.
ENCRSINC | Enables DCBCRS(2:0) auto-increment following each DCB transfer.
DCBCRS(2:0) | Display bus control register select address.
DCBADDR(3:0) | Display bus slave address.
ENSYNCACK | Enables display control bus protocol with synchronous acknowledge of
      data transfer with DCB_ACK_N.
ENASYNCACK | Enables display control bus protocol with asynchronous acknowledge of
      data transfer (four-edge handshake protocol with DCB_CS_N and
      DCB_ACK_N).
CSWIDTH(4:0) | # GIO_CLK cycles width for DCB_CS_N.
CSHOLD(4:0) | # GIO_CLK cycles hold time before DCB_CS_N de-asserted.
CSSETUP(4:0) | # GIO_CLK cycles setup before DCB_CS_N asserted.
SWAPENDIAN | OpenGL SWAP ENDIAN pixel storage attribute. When true, DCBDATA
      short and long packed pixel data have their byte ordering swapped.

Table 19: DCBMODE register.
3.2 Coordinate System


There are several ways to describe the coordinate system in REX3. First, its framebuffer contains a region
of 1280 x 1K pixels which can be displayed on a monitor. To the right of this area is an “off-screen” or
non-displayed section of memory which is 64 pixels wide, adjacent to the right edge of displayable memory.


The physical coordinates for this displayable space are: 4K,4K for the upper left corner, and 4K+1279, 5K- 
1 for the lower right corner. The lower right corner becomes 5K+63 including the off-screen memory space. 


The X11 window system normally considers the upper left region of displayable memory as being at 0,0; in
order to achieve this with REX3, the window relative bias register XYWIN is loaded with a 4K,4K offset value.
This allows the X11 coordinate system to be used directly with REX3, which supports the full 16b,16b
addressability (-32K through +32K-1 along each axis), without the need for host clipping.


The GL implementation running on REX3 relies on float-to-fixed point coordinate transformation shortcuts
which result in biased coordinates; this bias is hardwired within REX3 to a value of 4K,4K. Assuming that
the GL makes use of exactly this bias value, applications which rely on transformed coordinates do not need
to load XYWIN with the 4K,4K bias; instead, the XYWIN register is used for the window-relative offset, from the
displayed screen origin to the origin of the GL window of interest: xrel, yrel. If the GL uses a bias differing
from 4K then XYWIN must be explicitly biased by the value (GL bias minus 4K) so as to yield values of:
(xrel + GL bias - 4K), (yrel + GL bias - 4K).


The GL relies on a subset of the X11 address space, limited to 8K x 8K (0 thru 8K-1 along each axis, where
our origin is centered at or about 4K,4K, depending on the bias mentioned above). When the Y axis is
increasing in the downward direction (X11 system, which some GL code has been modified to conform to), the
DRAWMODE1 bit YFLIP is set to zero, and all window and screen origins are referenced to the upper left
of their respective area rectangles. When the Y axis increases upward (the usual GL convention) the
DRAWMODE1 bit YFLIP is set to one; now all window and screen origins are referenced to the lower left of
their respective area rectangles. In this case, XYWIN must be set to (0+xrel, 9K-1-yrel), where xrel,yrel are the signed
distance from screen origin to window origin (all lower left here), assuming 4K,4K biasing of the X,YSTART
and X,YEND coordinate values. This becomes a little more complex if biasing differs: (xrel + GL bias - 4K),
(5K + GL bias - 1 - yrel).
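
The XYWIN values for the GL cases above can be summarized in a few lines. This is a sketch of the arithmetic only, covering the biased-coordinate (GL) case and not the X11 case where XYWIN itself carries the 4K,4K offset; the function name and argument names are hypothetical.

```python
K = 1024


def gl_xywin(xrel, yrel, yflip, gl_bias=4 * K):
    """Return the (x, y) pair to load into XYWIN for a GL window.

    xrel, yrel: signed distance from screen origin to window origin.
    yflip:      DRAWMODE1 YFLIP (0 = origin upper left, 1 = origin lower left).
    gl_bias:    bias applied by the GL to coordinates (4K is the hardwired value).
    """
    x = xrel + gl_bias - 4 * K
    if yflip:
        y = 5 * K + gl_bias - 1 - yrel
    else:
        y = yrel + gl_bias - 4 * K
    return x, y
```

With the default 4K bias and YFLIP=1 the Y value reduces to 9K-1-yrel, matching the text.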


Four bits of subpixel positioning for XSTART, YSTART, XEND, and YEND are supported for line drawing, including
antialiasing and endpoint filtering.


An arbitrary signed offset may be applied to XSTART, YSTART by setting DRAWMODE0 bit XYOFFSET.
The signed value in XYMOVE is then applied. (Note: XYOFFSET should never be set for screen-to-screen
copy mode, which uses XYMOVE for its own offset between source and destination.)


3.3 Clipping and Masking 


Framebuffer values are conditionally written as a function of sector clipping, screen masking, CID masking, 
afunction, and color compare. (Transparent patterning also conditions the writes, see 3.8.1, Patterning and 
Stippling.) Bits within each write are masked by the 24b WRITEMASK register (in this case, '0' means don't 
write). 


Sector clipping is performed internally by REX3 so as to cull any writes which are outside the legal drawing
area, defined by VRAM space. This space is described in Section 3.2, Coordinate System. Note that reads
are not culled, so as to maintain simplified read behavior for DMA and host reads.


Screen masking is performed via the 5 sets of rectangles described by the SMASK registers. These are
controlled by the CLIPMODE register, to define invocation of each mask. All screenmasks are selectively
invoked by the ENSMASK field, and determine whether a given pixel is outside its area. SMASK0 is a GL
mask, which clips drawing outside its region; it is window-relative (affected by XYWIN and YFLIP) and conforms
to GL coordinate behavior (it must be biased in the same way as X and Y coordinates: see Section 3.2,
Coordinate System). Locations outside are masked. SMASK1-4 are X11 general-purpose masks, not window-relative;
coordinates are absolute, and unaffected by XYWIN or YFLIP, requiring the host to prebias them with the 4K,4K
offset.


Overall, a screenmasked pixel may be written iff it is:


{(inside any of enabled screenmasks 1-4) or (all screenmasks 1-4 disabled)} AND {inside screenmask0 or
screenmask0 disabled}.


Reads are never screenmasked. 
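
The write condition above can be expressed as a predicate. This is a sketch only: rectangles are represented as hypothetical (xmin, ymin, xmax, ymax) tuples with inclusive bounds, and the pixel and all five rectangles are assumed to have already been brought into one common coordinate space (the window-relative handling of SMASK0 is not modelled).

```python
def inside(rect, x, y):
    """True if (x, y) lies within rect = (xmin, ymin, xmax, ymax), inclusive."""
    xmin, ymin, xmax, ymax = rect
    return xmin <= x <= xmax and ymin <= y <= ymax


def screenmask_allows_write(ensmask, smasks, x, y):
    """Evaluate the screenmask write condition for one pixel.

    ensmask: CLIPMODE ENSMASK(4:0), bit i enables SMASKi.
    smasks:  list of five rectangles, index i holding SMASKi.
    """
    mask0_ok = not (ensmask & 1) or inside(smasks[0], x, y)
    enabled = [i for i in range(1, 5) if ensmask & (1 << i)]
    masks14_ok = not enabled or any(inside(smasks[i], x, y) for i in enabled)
    return masks14_ok and mask0_ok
```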


CID masking is invoked on writes to framebuffer whenever CLIPMODE register bits CIDMATCH are not
'1111'. In that case, the CID location corresponding to each framebuffer address is read and compared with the
CIDMATCH field. If there is a match, the framebuffer write is permitted. CID checking is never performed on
framebuffer reads.
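
Since each CIDMATCH bit selects one 2-bit CID code (Table 16), the check reduces to a single bit test. A minimal sketch with a hypothetical function name:

```python
def cid_allows_write(cidmatch, cid_code):
    """True if a framebuffer write is permitted for a pixel with this CID code.

    cidmatch: CLIPMODE CIDMATCH(3:0); bit n selects CID code n.
    cid_code: 2-bit CID value read from the pixel's CID plane location.
    """
    if cidmatch == 0xF:          # '1111' means CID checking is not invoked
        return True
    return bool((cidmatch >> (cid_code & 0x3)) & 1)
```

Note that the explicit '1111' test is redundant with the bit test (all four codes match); it is kept to mirror the text.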


Afunction, or alpha function, is a GL feature which allows the user to inhibit framebuffer writes for a specified
compare relationship between source alpha (either from DDA or host, for bit ALPHAHOST=0,1 respectively)
and a specified reference alpha stored in register ALPHAREF. The compare operator is given in the
DRAWMODE1 register COMPARE field.


Color compare is a peculiar feature of old GL releases for aiding in antialiased color index line drawing. For
RGBMODE=0, antialiased linedraw with DRAWMODE1 bits COMPARE not '111' will conditionally write
based on a comparison of source value and destination value.
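
Both afunction and color compare use the same three COMPARE enable bits, whose conditions are OR'ed (Table 11). The sketch below assumes that a true result means the write proceeds, which is consistent with '111' meaning unconditional writes; the function name is hypothetical. For afunction, src is the source alpha and dst is ALPHAREF.

```python
def compare_passes(compare, src, dst):
    """Evaluate DRAWMODE1 COMPARE(2:0) for a source/destination pair.

    Bit 2 enables src > dst, bit 1 enables src == dst, bit 0 enables
    src < dst; the enabled conditions are OR'ed together.
    """
    greater = bool(compare & 0b100) and src > dst
    equal = bool(compare & 0b010) and src == dst
    less = bool(compare & 0b001) and src < dst
    return greater or equal or less
```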


Writemasking is specified for 24b field and must match the bit positioning as described in Section 3.9, 
Framebuffer Formats (exception: writes to AUX planes only use lower 12b of the WRITEMASK). The 
WRITEMASK register is also used to specify double buffering, see Section 3.7, Double Buffering, for more 
details. 


3.4 Iterator Overview 


There are four types of hardware iterators in REX3: D,S1,S2; R,G,B,A/X; LSPATTERN,ZPATTERN; X,Y.
First, the D term Bresenham error stepping iterator controls the advance of the X,Y major axis for Bresenham
linedraw. Additional iterators are provided for antialiased linedraw, to control pixel coverage: S1 calculates
the coverage value, in conjunction with S2 which determines secondary pixel direction along the minor axis.
Second, DDA iterators for CI and R,G,B,A values for all planes. Third, recirculating iterators for line stipple
pattern (LSPATTERN) and polygon or Z mask pattern (ZPATTERN). Fourth, integer increment/decrement
iterators for X,Y of lines, spans and blocks.


The Bresenham stepper calculates one pixel address and coverage per clock. The Y iterator calculates one
value per clock (+/-1). The shader DDA calculates one or two pixel values per clock (x1, x2 times slope).
The pattern iterators calculate one, two, or four values per clock. The X iterator calculates one, two, four, or
32 values per clock (+/-1, +2, +/-4, +32).


Values per pipeline clock are determined as follows: aliased linedrawing, one/clk; antialiased linedrawing,
two/3 clks; shaded DDA spans/blocks, two/clk; flat DDA spans/blocks, four/clk; screen-to-screen block
copy, per read or write: four/clk; fast clear spans/blocks, 32/clk; host/DMA reads, one to four/clk, and writes,
one or two/clk, depending on the packed number of values per bus transfer specified in DRAWMODE1. For
more information on these modes, see Section 3.5, Framebuffer Access Modes.


Each of these iterators can be loaded with new starting values at the start of each primitive; they compute 
successive values within that primitive, for multiple-pixel primitives. Normally each iterator will retain, after 
primitive completion, the state corresponding with the point after that last drawn. (H/W Note: for back-to- 
back primitives, the completion of the first overlaps with the start of the next, so that the iterator is never 
loaded with the final state of first primitive, should the following primitive load the same iterator.) 


Special mode bits are provided so that connected lines, which cover the vertex of intersection twice (once per
iterated line), don't cause problems. For GL, stippled, connected lines would normally advance the line stipple
twice at the intersection; to prevent this, DRAWMODE0 bit LSADVLAST is set to zero, to inhibit LSPATTERN
advance at end of primitive. The pixel of intersection is, however, drawn twice. This is not desired for X11,
where lines could be drawn with LOGICOP=xor: then drawing the same location twice gives a different value than
drawing once! To handle this, bit SKIPLAST is set to inhibit drawing of the endpoint of a line, and retain the X,Y
state of the endpoint. This has the additional advantage of eliminating the need to reset the X,Y starting values
for successive connected lines (e.g., for the integer endpoint case). A SKIPFIRST bit is provided to skip the
first pixel of an antialiased line, should the host prefer to do the endpoint filtering itself. When this bit is set, the X,Y
iterators again retain the state of the endpoint. Note: the “first” pixel is the first pixel per “GO” event; the “last” pixel
is (are) that corresponding to the major axis end value.


3.5 Framebuffer Access Modes 


The framebuffer may be accessed as points, lines, spans, or blocks of data. Additionally, REX3 provides
autoincrementing address features so that a line may be accessed as successive points (or patterned segments,
for writes); a span as successive points or segments; and a block as successive points, segments,
or spans. Here the term “segment” is loosely used to refer to a fixed length string of pixels, usually a subset
of the primitive (line, span, or block row) being iterated. In the following subsections, “Segments I” refers to
packed host data, using the HOSTRW registers with COLORHOST or ALPHAHOST set; “Segments II” refers
to the remaining cases which have DRAWMODE0 bit LENGTH32 set.


3.5.1 Lines: Overview 


Line mode is indicated by DRAWMODE0 register field ADRMODE=I_LINE, F_LINE, or A_LINE. The Bresenham
setup is performed by REX3. This may include subpixel and antialiasing coverage calculations. For
information on integer versus subpixel positioned cases, see Section 3.6, Line Draw Instructions.


Line drawing is specified by DRAWMODE0 field OPCODE=draw; reading a line by OPCODE=read.


The endpoint of each line is not drawn if SKIPLAST=1; in this case the X,Y start state remains at the end- 
point; the startpoint is not drawn if SKIPFIRST=1 (note: SKIPFIRST is used at start of each primitive, so it 
should be cleared for second and later segments or points for case of primitive decomposed as such). 


3.5.1.1 Line Draw or Host Read: Points 


A line can be read or written as sequential points by setting STOPONX=STOPONY=0. The state of
XSTART, YSTART is post-iterated each access, in accordance with the Bresenham algorithm.


Prior to the first point, the host must write to address=SETUP to have REX calculate octant and Bresenham
terms.


3.5.1.2 Line Draw: Segments II


Here 32 pixels are drawn per primitive, until the end condition is reached. This is useful when patterning
(LSPATTERN, ZPATTERN) using the 32b pattern/stipple/z masking registers.


Prior to drawing the first segment, a write to address=SETUP is required, to perform octant and error term
initialization.


3.5.1.3 Line Draw: Full Line


This primitive draws a line as one command. 







3.5.2 Point Draw or Read


A point is described by a XSTART, YSTART pair. This may be packed into a single word as a pair of integer 
values (XYSTARTI), or as two words. 


Whether reading or writing points, the DRAWMODE0 register is initialized with ADRMODE=block,
DOSETUP=0.


A point is written using OPCODE=draw. A collection of points as X,Y pairs per transfer may be written as
a DMA to rapidly construct an arbitrary, monochrome shape, such as a circle. The DRAWMODE0 bit
XYOFFSET may be used to add XYMOVE to these X,Y values. A point is read using OPCODE=read.







3.5.3 Spans: Overview 


Unlike points, spans require an X endpoint; DRAWMODE0 field ADRMODE=span is set for spans, indicating
that the X stepping direction is to be implied by the sign of {XEND minus XSTART}. Currently there are no plans to
support Right-to-Left spans.


Spans may be culled by use of the DRAWMODE0 LRONLY bit: it aborts span primitives where {XEND <
XSTART}, allowing only Left-to-Right spans to draw. Spans are drawn using OPCODE=draw; they are read using
OPCODE=read.


User beware: the graphics state at the end of span iteration is determined by granularity of X coordinate 
stepping. 
3.5.3.1 Span Draw or Host Read: Segments I


This span drawing mode uses pixel values from host or DMA obtained through the HOSTRW registers. Depending
on the packing format, this could be one to sixteen pixels per 64b word, or one to eight per 32b word.
DRAWMODE0 bit COLORHOST (ALPHAHOST) is set to 1 to indicate that the pixel source is not the DDA.


This mode is also used for host reads of a span. 


The host must, in advance, issue a write to address=SETUP in order to have REX calculate quadrant.


3.5.3.2 Span Draw: Segments II


This span drawing mode, unlike the above, uses the DDA to calculate pixel value. It stops after 32 pixels
have been iterated, or the X endpoint is reached, whichever comes first. This is useful when using the
LSPATTERN or ZPATTERN features, for non-repeating pattern values, such as Z buffering or arbitrary X11
patterning. For spans of less than 33 pixels in length, the Full Span mode may be used instead.


The host must, in advance, issue a write to address=SETUP in order to have REX calculate quadrant.


3.5.3.3 Span Draw or DMA Read: Full Span
This span mode draws a span as a single primitive. 


A monochrome shape which is decomposed into a list of spans can be written using 64b writes as
XYENDI#XYSTARTI. The shape can be redrawn at various locations via use of DRAWMODE0 bit XYOFFSET and
XYMOVE.


This mode is used for DMA framebuffer reads of a span.


3.5.4 Blocks: Overview


Block mode is specified by DRAWMODE0 field ADRMODE=block. Drawing is performed on a span-by-span
basis. At the end of each span, the XY DDA steps the YSTART value and resets the XSTART value to that
from XSAVE; XSAVE is written whenever XSTART is updated by the host. In addition to the coordinates needed
for a span, the block mode also requires the YEND value. Stepping in the Y direction is implied by the sign of
{YEND minus YSTART}. As mentioned before, there is no support for Right-to-Left spans.


Block draw is performed with OPCODE=draw; reads via OPCODE=read. 


Polygon filling may use block draw mode to automatically step Y per span; the host then sets XSTART, XEND
per span. YEND is set initially to an extreme so as to simply imply the direction of Y axis stepping per row.
STOPONY=0 for this mode, which means the first three block draw cases below can support this.
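
The span-by-span stepping of a block, with XSTART restored from XSAVE and YSTART stepped toward YEND, can be modelled as a generator. This is an illustrative sketch with a hypothetical function name; it models address stepping only, not pixel values or segment truncation.

```python
def block_spans(xstart, ystart, xend, yend):
    """Yield (y, [x, ...]) for each row of a block, one span per row.

    xsave plays the role of XSAVE: it latches the host-written XSTART and
    is used to restore X at the end of every span. The Y step direction is
    implied by the sign of (yend - ystart).
    """
    xsave = xstart
    xstep = 1 if xend >= xstart else -1
    ystep = 1 if yend >= ystart else -1
    y = ystart
    while True:
        yield y, list(range(xsave, xend + xstep, xstep))
        if y == yend:
            break
        y += ystep
```

For example, a block from (2,5) to (4,3) yields rows 5, 4, 3, each covering X = 2..4.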


3.5.4.1 Block Draw or Host Read: Segments I 


Block draw in segments from host/memory is done as a sequence of span segment writes; a segment which
exceeds the block width is truncated so that a segment never covers two block rows (spans). The host must
set COLORHOST or ALPHAHOST=1 for this mode. See Span Draw, Segments I for more information.


This mode is also used for host reads of a framebuffer block.


The host must, in advance, issue a write to address=SETUP in order to have REX calculate quadrant.


3.5.4.2 Block Draw: Segments II


Each primitive draws 32 pixels, maximum. Used in conjunction with LSPATTERN, ZPATTERN. The block 
mode makes this useful for large character or other bit expansion drawing. Again, a primitive (segment) is 
truncated at the end of each row, and never applied to two rows. See Span Draw, Segments II for more 
information. 


The host must, in advance, issue a write to address=SETUP in order to have REX calculate quadrant.


3.5.4.3 Block Draw or Stride DMA Read: Spans


The block is drawn as a span per primitive, with the XY DDA performing post-increment of Y and reset of 
X. Useful for characters of < 33 pixels width, using bit expansion of LSPATTERN, ZPATTERN. 


Stride DMA reads use this mode. 


The host must, in advance, issue a write to address=SETUP in order to have REX calculate quadrant.


3.5.4.4 Block Draw or Linear DMA Read: Full Block
Draws an upright rectangular region as a single primitive. 


Linear DMA read uses this mode for full block. 







3.5.5 Fast Clear 


This drawing mode provides 4x rate, for fast area clear. There is no support for any per-pixel operations, such as
shade, stipple, dither, blend. Flat fill only, via the value previously written by the host into the COLORVRAM register.
The loading of COLORVRAM must be performed after DRAWMODE1 fields RGBMODE and DRAWDEPTH
have been set. In addition to the DRAWMODE0 settings given below, the DRAWMODE1 bit FASTCLEAR must be set.
DRAWMODE0 register OPCODE=draw, ADRMODE=block or span must be used. CID checking is not allowed for
this drawing mode. Spans must be Left to Right.




3.5.6 Screen-to-Screen Move 


Screen-to-screen copy is specified by DRAWMODE0 field OPCODE=Scr2Scr and ADRMODE=block or
span. The command setup is similar to the Full Block or Full Span draw, with the addition of a signed offset
to destination (**unlike REX1, which was offset to source**) specified by XYMOVE. This offset is with respect
to the window origin, and is therefore interpreted with respect to YFLIP. Block move supports
Right-to-Left spans. The host must order the X,Y start/end points (hence, quadrant) such that the copy does not
destroy itself in the process, for source area overlapping destination. Using this mode with XYMOVE=0 will
be slower than its obvious optimization. DRAWMODE0 bit XYOFFSET should be 0.
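
The ordering rule for overlapping copies is the same idea as memmove: start from the edge that lies in the direction of the move. The sketch below picks start/end corners from the sign of the destination offset and simulates a pixel-at-a-time copy to show the source is never overwritten before it is read. The function names are hypothetical, and the one-pixel-at-a-time simulation is a simplification of the hardware's multi-pixel transfers.

```python
def scr2scr_order(x0, y0, x1, y1, dx, dy):
    """Return (xstart, ystart, xend, yend) for a safe overlapping copy.

    (x0, y0)-(x1, y1) is the source rectangle with x0 <= x1 and y0 <= y1;
    (dx, dy) is the signed offset to the destination (XYMOVE).
    """
    xstart, xend = (x1, x0) if dx > 0 else (x0, x1)
    ystart, yend = (y1, y0) if dy > 0 else (y0, y1)
    return xstart, ystart, xend, yend


def simulate_copy(fb, x0, y0, x1, y1, dx, dy):
    """Copy the rectangle in place within fb (list of rows), one pixel at a time."""
    xstart, ystart, xend, yend = scr2scr_order(x0, y0, x1, y1, dx, dy)
    xstep = 1 if xend >= xstart else -1
    ystep = 1 if yend >= ystart else -1
    for y in range(ystart, yend + ystep, ystep):
        for x in range(xstart, xend + xstep, xstep):
            fb[y + dy][x + dx] = fb[y][x]
    return fb
```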
3.6 Line Draw Instructions


3.6.1 Bresenham Aliased Line Draw Instructions 


Newport is the first system that uses exclusively Bresenham algorithms as opposed to DDA. The main reason
is that Bresenham has infinite precision whereas DDA cannot guarantee predictability at any number
of bits of fraction. The second reason is that aliased lines have a much shorter setup since there is no division
for slope computation. The third reason is that by using Bresenham we can unify the hardware and
the algorithms for drawing both aliased and antialiased lines and polygons. The BRESROUND field of the
DRAWMODE0 register decides how the comparison between d and 0 should be executed:


If BRESROUND=1 Then // BRESROUND has 8 bits-one for each octant// 


If d < 0 Then // this branch is executed for d < 0 // 
Begin 
End Else // this branch is executed for d >= 0 // 


If BRESROUND=0 Then 


If d =< 0 Then // this branch is executed for d =< 0 // 
Begin 
End Else // this branch is executed for d > 0 // 


By appropriate programming of the BRESROUND bits we can produce hysteresis-free lines. 


3.6.1.1 I_Line(x1,y1,x2,y2,SKIPLAST,SKIPFIRST)
integer: x1,y1,x2,y2


This is an aliased line with integer endpoints. The intent is to have maximum performance at the expense of
line quality. The Bresenham algorithm allows for very short setup (no multiplication/division) and for
reproducibility of results (always touches the same pixels). REX3 computes the octant.


The performance is limited by : 

-the time for passing the arguments from the CPU to REX3 over GIO bus 
-the time for generating the setup values by REX3 : d=2dy-dx, etc (5 clocks) 
-time for iterating a new coordinate (1 clock) 


Since the coordinates are integer there are no precision requirements - Bresenham algorithm with integer 
endpoints is infinitely precise. If SKIPFIRST=TRUE the starting point (x1,y1) is not drawn by REX3. 
If SKIPLAST=TRUE the endpoint (x2,y2) is not drawn by REX3. 
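
The integer line iteration, including the BRESROUND tie-break and the SKIPFIRST/SKIPLAST flags, can be sketched in software. This is a simplified model, not the REX3 implementation: it handles only the first octant (0 <= dy <= dx), takes a single BRESROUND bit rather than one per octant, and uses the textbook increments incr1=2dy and incr2=2(dy-dx), which the text names but does not define.

```python
def i_line(x1, y1, x2, y2, skip_first=False, skip_last=False, bresround=1):
    """Return the pixels of a first-octant integer Bresenham line.

    bresround=1 steps the minor axis when d >= 0 (test is d < 0);
    bresround=0 steps the minor axis when d > 0 (test is d <= 0).
    """
    dx, dy = x2 - x1, y2 - y1
    if not 0 <= dy <= dx:
        raise ValueError("this sketch handles the first octant only")
    d = 2 * dy - dx            # setup value quoted in the text
    incr1 = 2 * dy             # assumed: added when the minor axis holds
    incr2 = 2 * (dy - dx)      # assumed: added when the minor axis steps
    points = []
    x, y = x1, y1
    while True:
        points.append((x, y))
        if x == x2:
            break
        hold = d < 0 if bresround else d <= 0
        if hold:
            d += incr1
        else:
            y += 1
            d += incr2
        x += 1
    if skip_last and points:
        points.pop()           # endpoint (x2, y2) is not drawn
    if skip_first and points:
        points.pop(0)          # starting point (x1, y1) is not drawn
    return points
```

The two BRESROUND settings differ only when d is exactly 0, which is what allows a line and its reverse to be made to touch the same pixels.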


Register-level description:
XYSTARTI=x1,y1 //only the listed registers must be saved at context switch because: //
XYENDI=x2,y2 //-all variables used by Bresenham are derived from the input variables//
DRAWMODE0: OPCODE=draw, ADRMODE=I_LINE


Context-switched registers:
XSTART=x_current
YSTART=y_current


XEND=x2 // necessary for computing the pixel count //
YEND=y2 // necessary for computing the pixel count //
BRESD=d // current d value //
BRESOCTINC1=octant,incr1 // octant + incr1 for d //
BRESINC2=incr2 // incr2 for d //
3.6.1.2 F_Line(x1,y1,x2,y2,SKIPLAST,SKIPFIRST) 


fixed : x1,y1,x2,y2 


This is an aliased line with fractional endpoints. The intent is to have maximum performance at the expense
of line quality. The Bresenham algorithm allows for very short setup (two multiplications and no division) and for
reproducibility of results (always touches the same pixels). REX3 or the CPU computes the octant.


The performance is limited by:

-the time for passing the arguments from the CPU to REX3 over GIO bus
-the time for generating the setup values by REX3: d=3dy-2dx+2(dx*y_fract-dy*x_fract) (12 clocks). All GL
linedrawing primitives must use 3dy-2dx due to the way GL views the coordinate system as opposed to X.
-time for iterating a new coordinate (1 clock)
-time for drawing the fractional coverage endpoints


A serial multiplier is necessary for computing d. Since the multiplicand involved (x_fract, y_fract) has very few
bits, a serial multiplier executes the required multiplication in a few cycles. If SKIPFIRST=TRUE the starting
point (x1,y1) is not drawn by REX3. If SKIPLAST=TRUE the endpoint (x2,y2) is not drawn by REX3.


Register-level description:
XSTART=x1 //fixed point number in 16.4 format//
YSTART=y1 //fixed point number in 16.4 format//
XEND=x2 //fixed point number//
YEND=y2 //fixed point number//
DRAWMODE0: OPCODE=draw, ADRMODE=F_LINE
Context-switched registers:
XSTART=x_current
YSTART=y_current
XEND=x2
YEND=y2
BRESD=d
BRESOCTINC1=octant,incr1
BRESINC2=incr2


3.6.2 Bresenham Antialiased Line Draw Instructions


3.6.2.1 A_Line(x1,y1,x2,y2,e1,aa_table,SKIPFIRST,SKIPLAST)
fixed : x1,y1,x2,y2,e1
array : aa_table // angle-compensated table of pixel coverages indexed by s //


THIS PRIMITIVE WILL ALSO BE USED FOR GENERATING ANTIALIASING EDGES BY MASKING OUT
THE TOP OR BOTTOM HALF WITH THE HELP OF THE ZPATTERN (BOTTOM HALF), LSPATTERN
(TOP HALF) REGISTERS.

This is an anti-aliased line with fractional endpoints and with angle compensation but without any endpoint
filtering. It has INFINITE precision in terms of pixel positioning (exactly like I_LINE) since it doesn't rely on a
DDA algorithm in terms of position determination. The intent is to generate high quality lines at 80-90% the
speed of aliased blended lines. The width of the line is restricted to 1 - for wider lines (and for polygons) the
Bresenham antialiasing edge (see 3.6.2.3) should be used. Two pixels (in the minor axis direction) are
interpolated at each major axis iteration. This approach allows for line intensity independent of line angle
(i.e. independent of pixel density). The performance is limited by:


-the time required for the CPU to compute the slope e1 and to find the aa_table (which is a function of e1)
in memory

-the time for passing the arguments from the CPU to REX3 over GIO bus

-the time for generating the setup : d=3dy-2dx+2(dx*y_fract-dy*x_fract) , s=y_fract+e1*(.5-x_fract)-.5 ,
sdx=s*2dx=dy-dx+2(dx*y_fract-dy*x_fract)

-REX3 time for iterating two new coordinates (closely related to each other) (3 clocks/pair)
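
The setup terms in the list above can be checked numerically. The following Python sketch is an illustration only, assuming an x-major line in octant 1 with e1=dy/dx; the function name is hypothetical. Exact rational arithmetic is used so that the identity sdx=s*2dx can be tested without rounding.

```python
from fractions import Fraction


def aa_line_setup(dx, dy, x_fract, y_fract):
    """Return (d, s, sdx) for an octant-1 line, following the setup formulas."""
    e1 = Fraction(dy, dx)                                   # slope, dy/dx for x-major
    d = 3 * dy - 2 * dx + 2 * (dx * y_fract - dy * x_fract)
    s = y_fract + e1 * (Fraction(1, 2) - x_fract) - Fraction(1, 2)
    sdx = dy - dx + 2 * (dx * y_fract - dy * x_fract)       # equals s * 2dx
    return d, s, sdx


d, s, sdx = aa_line_setup(8, 3, Fraction(1, 4), Fraction(3, 4))
print(d, s, sdx)  # 7/2 11/32 11/2
```

Because sdx carries no division by dx, it can be held exactly in fixed point, while s itself depends on the quantized slope e1.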


A serial multiplier is necessary for computing d and s. If SKIPLAST=TRUE the endpoint (x2,y2) is not drawn
by REX3. If SKIPFIRST=TRUE the first point (x1,y1) is not drawn by REX3. The algorithm draws two pixels
(T and S) at each iteration. The coverages for these two pixels are derived by indexing into the AWEIGHT
table with a function of s as described below. The AWEIGHT table needs to be reloaded for every change in
the line slope e1. Note that here s_frac represents the absolute value of the fractional part of s.


If 0=<s=<1 Then
Begin
coverage_T=f(s)=f(s_frac) //s_frac=Fraction(s)//
coverage_S=f(1-s)=f(1-s_frac)=f(~s_frac) //~s_frac=1.0-s_frac //
End
If -1=<s<0 Then
Begin
coverage_T=f(1+s)=f(~s_frac)
coverage_S=f(-s)=f(s_frac)
End


The case for d>0 that includes the subcases 0<s<1 and 1<s<2 reduces to the above case if we manage to
arrange for -1<s<1. This is done by adding e2=e1-1 prior to rendering the pixels. As can be seen, the
AWEIGHT table is indexed with s_frac and ~s_frac (one's complement of the fractional portion of s).
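
The two-case lookup above can be sketched in Python. The coverage function f stands in for the AWEIGHT table; the linear ramp used in the example is purely a placeholder assumption, since the real table is angle compensated and computed per slope by the host.

```python
def coverages(s, f):
    """Return (coverage_T, coverage_S) for -1 <= s <= 1 using lookup function f."""
    if 0 <= s <= 1:
        return f(s), f(1 - s)      # f(s_frac), f(~s_frac)
    if -1 <= s < 0:
        return f(1 + s), f(-s)     # f(~s_frac), f(s_frac)
    raise ValueError("s must lie in [-1, 1]")


def ramp(t):
    """Placeholder table: coverage equals the index (not the real AWEIGHT contents)."""
    return t


print(coverages(0.25, ramp))   # (0.25, 0.75)
print(coverages(-0.25, ramp))  # (0.75, 0.25)
```

In both cases the table is indexed with the fractional part of s and its complement; only the assignment of the two results to T and S changes with the sign of s.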


Register-level description:
XSTART=x1 //fixed point number//
YSTART=y1
XEND=x2
YEND=y2
aa_table=AWEIGHT0,1=function(s,e1) //this table is calculated for each slope and is indexed by s //
DRAWMODE: OPCODE=A line
BLEND=enabled
SFACTOR=BF_SA // SFACTOR=alpha//
DFACTOR=BF_MSA // DFACTOR=1-alpha//
Context-switched registers:
XSTART=x_current
YSTART=y_current
XEND=x2
YEND=y2
BRESE1=e1
BRESD=d
BRESS1=s
BRESS2=sdx //sdx=s*2dx must be context switched//
BRESOCTINC1=octant,incr1
BRESINC2=incr2
AWEIGHT0,AWEIGHT1=aa_table


3.6.2.2 A Edge Top(x1,y1,x2,y2,e1,aa_table,SKIPFIRST,SKIPLAST,ENDPTFILTER)
fixed : x1,y1,x2,y2,e1
array : aa_table


THIS PRIMITIVE IS REMOVED FROM THE REX3 INSTRUCTION SET. THE REASON FOR NOT REMOVING
IT FROM THE SPEC IS TO ALLOW GL CODERS TO UNDERSTAND WHAT IT IS THAT THEY NEED TO DO
IN ORDER TO COMPUTE THE MASKS (ZPATTERN, LSPATTERN) USED FOR 3D ANTIALIASED LINES.
This is an anti-aliasing polygon edge with fractional endpoints. It differs from the antialiased Bresenham
line because only one pixel is drawn at each iteration (the pixel external to the polygon). For clockwise
polygons A Edge Top is invoked by the CPU for edges located in octants 1,3,5,7 (for even octants the CPU
must invoke A Edge Bottom). The reason for this is that in octants 1,3,5,7 it is the top pixel that lies outside
the polygon whereas in octants 2,4,6,8 it is the bottom pixel that lies on the outside. For counterclockwise
polygons the convention is reversed: the CPU must invoke A Edge Top for edges in octants 2,4,6,8 and
A Edge Bottom in octants 1,3,5,7. The overhead for computing the octant is nil since the CPU must do it
anyway in order to calculate the z-mask. It has INFINITE precision in terms of pixel positioning (exactly like
I LINE) since it doesn't rely on a DDA algorithm in terms of position determination. Since GL has a very
precise notion of T-mesh edge it is possible to use this primitive to antialias only the contour of the mesh without
touching the inner edges. Only the pixels above the infinitely precise line are being rendered ("above" is
viewed as looking down the axis of maximum motion). The AWEIGHT table needs to be reloaded for every
change in the line slope e1. Two types of antialiasing edges (top and bottom) have been invented in order
to facilitate polygon antialiasing. If a top and a bottom edge are drawn between the same pair of points (x1,y1)
and (x2,y2) an antialiased line will result. The performance is limited by :


-the time required for the CPU to compute the slope e1 

-the time for passing the arguments from the CPU to REX3 over GIO bus 

-the time for generating the setup : d=3dy-2dx+2(dx*y_fract-dy*x_fract) , s=y_fract+e1*(.5-x_fract)-.5 ,
sdx=s*2dx=dy-dx+2(dx*y_fract-dy*x_fract)

-REX3 time for iterating one new coordinate

-REX3 time for drawing the fractional coverage endpoints


The endpoints may not be drawn in order to simplify the implementation and in order to generate the
impression of sharp vertices.
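
The octant convention described earlier in this section (odd octants use the top edge for clockwise polygons, and the rule is reversed for counterclockwise polygons) reduces to a single comparison. The Python sketch below is illustrative only; the function name and the string return values are hypothetical.

```python
def edge_primitive(octant, clockwise=True):
    """Pick the antialiasing edge primitive for a polygon edge in the given octant (1..8)."""
    if not 1 <= octant <= 8:
        raise ValueError("octant must be in 1..8")
    odd = octant % 2 == 1
    # Clockwise: odd octants -> top edge. Counterclockwise: even octants -> top edge.
    return "A Edge Top" if odd == clockwise else "A Edge Bottom"


print(edge_primitive(3, clockwise=True))   # A Edge Top
print(edge_primitive(3, clockwise=False))  # A Edge Bottom
```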


Register-level description:
XSTART=x1 //fixed point number//
YSTART=y1
XEND=x2
YEND=y2
AWEIGHT0,AWEIGHT1=aa_table(s)
DRAWMODE: OPCODE=AA Edge Top

Context-switched registers:
XSTART=x_current
YSTART=y_current
XEND=x2
YEND=y2
BRESE1=e1
BRESD=d
BRESS1=s
BRESS2=sdx
BRESOCTINC1=octant,incr1
BRESINC2=incr2
AWEIGHT0,AWEIGHT1=aa_table


3.6.2.3 A Edge Bottom(x1,y1,x2,y2,e1,aa_table,SKIPFIRST,SKIPLAST,ENDPTFILTER)
fixed : x1,y1,x2,y2,e1
array : aa_table


THIS PRIMITIVE IS REMOVED FROM THE REX3 INSTRUCTION SET. THE REASON FOR NOT REMOVING
IT FROM THE SPEC IS TO ALLOW GL CODERS TO UNDERSTAND WHAT IT IS THAT THEY NEED TO DO
IN ORDER TO COMPUTE THE MASKS (ZPATTERN, LSPATTERN) USED FOR 3D ANTIALIASED LINES.
This is an anti-aliasing polygon edge with fractional endpoints. It differs from the antialiased Bresenham
line because only one pixel is drawn at each iteration (the pixel external to the polygon). It has INFINITE
precision in terms of pixel positioning (exactly like I LINE) since it doesn't rely on a DDA algorithm in terms of
position determination. Since GL has a very precise notion of T-mesh edge it is possible to use this primitive
to antialias only the contour of the mesh without touching the inner edges. Only the pixels below the infinitely
precise line are being rendered ("below" is viewed as looking down the axis of maximum motion).


The performance is limited by :
-the time required for the CPU to compute the slope e1
-the time for passing the arguments from the CPU to REX3 over GIO bus
-the time for generating the setup : d=3dy-2dx+2(dx*y_fract-dy*x_fract) , s=y_fract+e1*(.5-x_fract)-.5 ,
sdx=s*2dx=dy-dx+2(dx*y_fract-dy*x_fract)
-REX3 time for iterating one new coordinate
-REX3 time for drawing the fractional coverage endpoints


The endpoints may not be drawn in order to simplify the implementation and in order to generate the
impression of sharp vertices.


Register-level description:
XSTART=x1 //fixed point number//
YSTART=y1
XEND=x2
YEND=y2
AWEIGHT0,AWEIGHT1=aa_table(s)
DRAWMODE: OPCODE=AA Edge Bottom

Context-switched registers:
XSTART=x_current
YSTART=y_current
XEND=x2
YEND=y2
BRESE1=e1
BRESD=d
BRESS1=s
BRESS2=sdx
BRESOCTINC1=octant,incr1
BRESINC2=incr2
AWEIGHT0,AWEIGHT1=aa_table


Code for X-line with integer endpoints


Procedure X_line(x1,y1,x2,y2,SKIPLAST,SKIPFIRST)
integer: x1,y1,x2,y2
Begin
//Compute the octant-independent values//
x=x1, y=y1
dx=ABS(x1-x2), dy=ABS(y1-y2)
Coverage=1
If SKIPFIRST=FALSE Then Write_Pixel(x,y,Coverage) //Starting pixel has coverage=1//
Case Octant of (x2-x1,y2-y1,dx-dy) :


1: d=2dy-dx , incr1=2dy , incr2=2(dy-dx), Loop=dx //compute the octant-dependent values//
incrx1=1,incrx2=1 ,incry1=0,incry2=1

2: d=2dx-dy , incr1=2dx , incr2=2(dx-dy), Loop=dy //compute the octant-dependent values//
incrx1=0,incrx2=1 ,incry1=1,incry2=1

3: d=2dx-dy , incr1=2dx , incr2=2(dx-dy), Loop=dy //compute the octant-dependent values//
incrx1=0,incrx2=-1 ,incry1=1,incry2=1

4: d=2dy-dx , incr1=2dy , incr2=2(dy-dx), Loop=dx //compute the octant-dependent values//
incrx1=-1,incrx2=-1 ,incry1=0,incry2=1

5: d=2dy-dx , incr1=2dy , incr2=2(dy-dx), Loop=dx //compute the octant-dependent values//
incrx1=-1,incrx2=-1 ,incry1=0,incry2=-1

6: d=2dx-dy , incr1=2dx , incr2=2(dx-dy), Loop=dy //compute the octant-dependent values//
incrx1=0,incrx2=-1 ,incry1=-1,incry2=-1

7: d=2dx-dy , incr1=2dx , incr2=2(dx-dy), Loop=dy //compute the octant-dependent values//
incrx1=0,incrx2=1 ,incry1=-1,incry2=-1

8: d=2dy-dx , incr1=2dy , incr2=2(dy-dx), Loop=dx //compute the octant-dependent values//
incrx1=1,incrx2=1 ,incry1=0,incry2=-1


For i=1 to Loop-1 Do
Begin
If d<0 Then // s<t , execute a horizontal step//
Begin
x=x+incrx1 //advance to next pixel//
y=y+incry1
d=d+incr1 //compute new values for d and s//
End Else //s>t>0, execute a 45 degree step//
Begin //45 degree move//
x=x+incrx2
y=y+incry2
d=d+incr2
End
Write_Pixel(x,y,Coverage)
End
If SKIPLAST=FALSE Then Write_Pixel(x2,y2,Coverage)
End
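
The X_line procedure above can be exercised with a small runnable Python sketch. This is an illustration under stated assumptions, not the REX3 implementation: the octant is derived here from the signs of x2-x1 and y2-y1, ties (dx=dy) are treated as x-major, and pixels are collected in a list instead of being written to a frame buffer. The names octant_of, STEPS and x_line are hypothetical.

```python
# octant: (incrx1, incrx2, incry1, incry2), as tabulated in the pseudocode
STEPS = {1: (1, 1, 0, 1), 2: (0, 1, 1, 1), 3: (0, -1, 1, 1), 4: (-1, -1, 0, 1),
         5: (-1, -1, 0, -1), 6: (0, -1, -1, -1), 7: (0, 1, -1, -1), 8: (1, 1, 0, -1)}


def octant_of(x1, y1, x2, y2):
    """Octant 1..8, counted counterclockwise; dx == dy is treated as x-major (assumption)."""
    sx, sy = x2 - x1, y2 - y1
    x_major = abs(sx) >= abs(sy)
    if sx >= 0 and sy >= 0:
        return 1 if x_major else 2
    if sx < 0 and sy >= 0:
        return 4 if x_major else 3
    if sx < 0 and sy < 0:
        return 5 if x_major else 6
    return 8 if x_major else 7


def x_line(x1, y1, x2, y2, skip_first=False, skip_last=False):
    """Return the pixels of an integer-endpoint line, following the X_line pseudocode."""
    dx, dy = abs(x1 - x2), abs(y1 - y2)
    octant = octant_of(x1, y1, x2, y2)
    incrx1, incrx2, incry1, incry2 = STEPS[octant]
    major, minor = (dx, dy) if octant in (1, 4, 5, 8) else (dy, dx)
    d, incr1, incr2 = 2 * minor - major, 2 * minor, 2 * (minor - major)
    x, y = x1, y1
    pixels = []
    if not skip_first:
        pixels.append((x, y))
    for _ in range(1, major):              # For i=1 to Loop-1
        if d < 0:                          # step along the major axis only
            x, y, d = x + incrx1, y + incry1, d + incr1
        else:                              # 45 degree step
            x, y, d = x + incrx2, y + incry2, d + incr2
        pixels.append((x, y))
    if not skip_last:
        pixels.append((x2, y2))            # a zero-length line repeats the start pixel here
    return pixels


print(x_line(0, 0, 5, 2))  # [(0, 0), (1, 0), (2, 1), (3, 1), (4, 2), (5, 2)]
```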





August 13, 1993 page44 


SILICON GRAPHICS PROPRIETARY and CONFIDENTIAL 





Code for aliased line with fractional endpoints 


Procedure GL_Bresenham(x1,y1,x2,y2,SKIPLAST,SKIPFIRST)
fixed : x1,y1,x2,y2 //x=x_int.x_fract where x_fract is 4 bits of precision//
fixed : dx,dy
integer: x10,y10,x20,y20,dx_i,dy_i
//Compute the octant-independent values//
x10=int(x1) , y10=int(y1) //REX3 computes the fixed->int and the d term//
x20=int(x2) , y20=int(y2)
x=x10, y=y10
dx=ABS(x1-x2), dy=ABS(y1-y2)
dx_i=ABS(x10-x20)-1, dy_i=ABS(y10-y20)-1
Case Octant of (x2-x1,y2-y1,dx-dy) :
1: d=3dy-2dx , incr1=2dy , incr2=2(dy-dx), Loop=dx_i //compute the octant-dependent values//
incrx1=1 ,incrx2=1 ,incry1=0,incry2=1
2: d=3dx-2dy , incr1=2dx , incr2=2(dx-dy), Loop=dy_i //compute the octant-dependent values//
incrx1=0,incrx2=1 ,incry1=1 ,incry2=1
temp=x1_fract //swap x and y//
x1_fract=y1_fract
y1_fract=temp
temp=dx
dx=dy
dy=temp
3: d=3dx-2dy , incr1=2dx , incr2=2(dx-dy), Loop=dy_i //compute the octant-dependent values//
incrx1=0,incrx2=-1 ,incry1=1 ,incry2=1
temp=1-x1_fract //use 1-x_fract left of y-axis//
x1_fract=y1_fract
y1_fract=temp
temp=dx
dx=dy
dy=temp
4: d=3dy-2dx , incr1=2dy , incr2=2(dy-dx), Loop=dx_i //compute the octant-dependent values//
incrx1=-1 ,incrx2=-1 ,incry1=0,incry2=1
x1_fract=1-x1_fract //use 1-x_fract left of y-axis//
5: d=3dy-2dx , incr1=2dy , incr2=2(dy-dx), Loop=dx_i //compute the octant-dependent values//
incrx1=-1 ,incrx2=-1 ,incry1=0,incry2=-1
x1_fract=1-x1_fract //use 1-x_fract left of y-axis//
y1_fract=1-y1_fract //use 1-y_fract below of x-axis//
6: d=3dx-2dy , incr1=2dx , incr2=2(dx-dy), Loop=dy_i //compute the octant-dependent values//
incrx1=0,incrx2=-1,incry1=-1,incry2=-1
temp=1-x1_fract
x1_fract=1-y1_fract //use 1-y_fract below of x-axis//
y1_fract=temp
temp=dx
dx=dy
dy=temp
7: d=3dx-2dy , incr1=2dx , incr2=2(dx-dy), Loop=dy_i //compute the octant-dependent values//
incrx1=0,incrx2=1 ,incry1=-1 ,incry2=-1
temp=1-y1_fract //use 1-y_fract below of x-axis//
y1_fract=x1_fract
x1_fract=temp
temp=dx
dx=dy
dy=temp
8: d=3dy-2dx , incr1=2dy , incr2=2(dy-dx), Loop=dx_i //compute the octant-dependent values//
incrx1=1 ,incrx2=1 ,incry1=0,incry2=-1
y1_fract=1-y1_fract //use 1-y_fract below of x-axis//


d=d+2(dx*y1_fract-dy*x1_fract) //adjust d due to fractional endpoints//
E=d-2dx //variable used for adjusting the start point up one pixel//
If E>0 Then
Begin
d=E
x=x+incrx2*~x_major
y=y+incry2*x_major
End
Coverage=1 // or we can use the CPU-calculated coverage //
//* This section has been removed on 11/5/92 as a result of a discussion with BobS
If SKIPFIRST=FALSE Then
Begin
Write_Pixel(x,y,Coverage) //Starting pixel has coverage=1, can be drawn conditionally//
End
For i=1 to Loop-1 Do
Begin
If d<0 Then // s<t , execute a horizontal step//
Begin
x=x+incrx1 //advance to next pixel//
y=y+incry1
d=d+incr1 //compute new values for d and s//
End Else //s>t>0, execute a 45 degree step//
Begin //45 degree move//
x=x+incrx2
y=y+incry2
d=d+incr2
End
Write_Pixel(x,y,Coverage)
End


If SKIPLAST=FALSE Then //Draw the last pixel conditionally//
Begin
If d<0 Then // s<t , execute a horizontal step//
Begin
x=x+incrx1 //advance to next pixel//
y=y+incry1
d=d+incr1 //compute new values for d and s//
End Else //s>t>0, execute a 45 degree step//
Begin //45 degree move//
x=x+incrx2
y=y+incry2
d=d+incr2
End
Write_Pixel(x,y,Coverage)
End
End
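
The fractional adjustment of d in GL_Bresenham has a geometric reading that a short Python check can make concrete. The sketch below covers octant 1 only and assumes pixel centres at integer + 0.5, which is an interpretation suggested by the 0.5 and 1.5 terms in the formulas rather than something stated in the listing. Under that assumption d equals 2dx times the height of the line at the next column centre above the boundary between the two candidate rows. All function names are hypothetical.

```python
from fractions import Fraction


def start_d(dx, dy, x_fract, y_fract):
    """Octant-1 decision term after the fractional-endpoint adjustment."""
    return 3 * dy - 2 * dx + 2 * (dx * y_fract - dy * x_fract)


def start_d_geometric(dx, dy, x_fract, y_fract):
    """2*dx*(line height at x_int+1.5, relative to y_int, minus the row boundary at 1)."""
    e1 = Fraction(dy, dx)
    line_y = y_fract + e1 * (Fraction(3, 2) - x_fract)
    return 2 * dx * (line_y - 1)


def adjust_start(d, dx):
    """E test: returns (d, moved), moved=True when the start pixel shifts one minor-axis step."""
    e = d - 2 * dx
    return (e, True) if e > 0 else (d, False)


d = start_d(4, 4, Fraction(0), Fraction(15, 16))
print(d, adjust_start(d, 4))
```

In the example the line starts near the top of its pixel with slope 1, so E>0 and the start point is moved up by one pixel, as the comment on E in the listing describes.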
Code for antialiased line with fractional endpoints and angle compensation, no endpoint filtering


Procedure Write_Pixel(x,y,alpha) // 0=< alpha<=1 due to looking it up in aa_table //
global variable : new_color //new_color is the current drawing color//
Begin
Read_Framebuffer(x,y,bckg_color) //read the background color at location (x,y)//
color=alpha*new_color + (1-alpha)*bckg_color // alpha represents pixel coverage//
Write_Framebuffer(x,y,color) //write back the result of blending to location (x,y)//
End
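
The read-blend-write sequence of Write_Pixel can be sketched in Python as follows. The frame buffer is modelled as a dictionary keyed by (x,y) and colors as tuples of floats; both choices, and the default background color, are assumptions made only for this illustration.

```python
def blend(alpha, new_color, bckg_color):
    """color = alpha*new_color + (1-alpha)*bckg_color, applied per component."""
    return tuple(alpha * n + (1 - alpha) * b for n, b in zip(new_color, bckg_color))


def write_pixel(framebuffer, x, y, alpha, new_color, default=(0.0, 0.0, 0.0)):
    """Read the background at (x, y), blend by coverage alpha, write the result back."""
    bckg_color = framebuffer.get((x, y), default)
    framebuffer[(x, y)] = blend(alpha, new_color, bckg_color)


fb = {(3, 4): (0.0, 0.0, 1.0)}               # blue background pixel
write_pixel(fb, 3, 4, 0.25, (1.0, 0.0, 0.0))  # red line with 25% coverage
print(fb[(3, 4)])  # (0.25, 0.0, 0.75)
```

This corresponds to the register setting SFACTOR=alpha, DFACTOR=1-alpha given in the register-level description of the A line.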


Procedure GL_AA_Bresenham(x1,y1,x2,y2,e1,c1)
fixed : x1,y1,x2,y2,e1 //CPU computes the octant//
array : aa_table0(s,e1) ,aa_table1(1-s,e1) // This array is a function of slope and is indexed with s_frac //
integer : Octant,x10,y10,x20,y20
integer : x_major //x_major=1 in octants 1,4,5,8 //
//e1=dy/dx for x-major . e1=dx/dy for y-major where dx=ABS(x1-x2) and dy=ABS(y1-y2)//
//Compute the octant-independent values//
e2=e1-1.0
x10=int(x1) , y10=int(y1) //REX3 computes the fixed->int and the d term//
x20=int(x2) , y20=int(y2)
x=x10, y=y10
dx=ABS(x1-x2), dy=ABS(y1-y2)
dx_i=ABS(x10-x20)-1, dy_i=ABS(y10-y20)-1
Case Octant of (x2-x1,y2-y1,dx-dy) :


1: d=3dy-2dx , incr1=2dy , incr2=2(dy-dx), Loop=dx_i //compute the octant-dependent values//
incrx1=1 ,incrx2=1 ,incry1=0,incry2=1 ,x_major=1
2: d=3dx-2dy , incr1=2dx , incr2=2(dx-dy), Loop=dy_i //compute the octant-dependent values//
incrx1=0,incrx2=1 ,incry1=1,incry2=1 ,x_major=0
temp=x1_fract
x1_fract=y1_fract
y1_fract=temp
temp=x2_fract
x2_fract=y2_fract
y2_fract=temp
temp=dx
dx=dy
dy=temp
3: d=3dx-2dy , incr1=2dx , incr2=2(dx-dy), Loop=dy_i //compute the octant-dependent values//
incrx1=0,incrx2=-1,incry1=1,incry2=1 ,x_major=0
temp=1-x1_fract //use 1-x_fract left of y-axis//
x1_fract=y1_fract
y1_fract=temp
temp=1-x2_fract //use 1-x_fract left of y-axis//
x2_fract=y2_fract
y2_fract=temp
temp=dx
dx=dy
dy=temp
4: d=3dy-2dx , incr1=2dy , incr2=2(dy-dx), Loop=dx_i //compute the octant-dependent values//
incrx1=-1 ,incrx2=-1 ,incry1=0,incry2=1 ,x_major=1
x1_fract=1-x1_fract //use 1-x_fract left of y-axis//
x2_fract=1-x2_fract
5: d=3dy-2dx , incr1=2dy , incr2=2(dy-dx), Loop=dx_i //compute the octant-dependent values//
incrx1=-1 ,incrx2=-1 ,incry1=0,incry2=-1 ,x_major=1
x1_fract=1-x1_fract //use 1-x_fract left of y-axis//
y1_fract=1-y1_fract //use 1-y_fract below of x-axis//
x2_fract=1-x2_fract //use 1-x_fract left of y-axis//
y2_fract=1-y2_fract //use 1-y_fract below of x-axis//
6: d=3dx-2dy , incr1=2dx , incr2=2(dx-dy), Loop=dy_i //compute the octant-dependent values//
incrx1=0,incrx2=-1,incry1=-1,incry2=-1 ,x_major=0
temp=1-x1_fract
x1_fract=1-y1_fract //use 1-y_fract below of x-axis//
y1_fract=temp
temp=1-x2_fract
x2_fract=1-y2_fract //use 1-y_fract below of x-axis//
y2_fract=temp
temp=dx
dx=dy
dy=temp
7: d=3dx-2dy , incr1=2dx , incr2=2(dx-dy), Loop=dy_i //compute the octant-dependent values//
incrx1=0,incrx2=1 ,incry1=-1,incry2=-1 ,x_major=0
temp=1-y1_fract //use 1-y_fract below of x-axis//
y1_fract=x1_fract
x1_fract=temp
temp=1-y2_fract //use 1-y_fract below of x-axis//
y2_fract=x2_fract
x2_fract=temp
temp=dx
dx=dy
dy=temp
8: d=3dy-2dx , incr1=2dy , incr2=2(dy-dx), Loop=dx_i //compute the octant-dependent values//
incrx1=1 ,incrx2=1 ,incry1=0,incry2=-1 ,x_major=1
y1_fract=1-y1_fract //use 1-y_fract below of x-axis//
y2_fract=1-y2_fract


s=y1_fract-0.5+e1(0.5-x1_fract) //s for the first pixel//
sdx=2[(y1_fract-0.5)dx+(0.5-x1_fract)dy]=dy-dx+2(dx*y1_fract-dy*x1_fract)
// sdx=s*2dx is an infinitely precise number //
If s<0 Then // The correct , positive s, is in this case s=y1_fract+0.5+e1(0.5-x1_fract) = s+1 //
Begin
s=s+1
sdx=sdx+2*dx //when s=s+1 sdx=sdx+2*dx //
End
d=d+2(dx*y1_fract-dy*x1_fract) //adjust d due to fractional endpoints,this is d for second pixel//
// s=y1_fract+e1*(1.5-x1_fract)-0.5 this would have been s for second pixel , s=s+e1 //
E=d-2dx
If E>0 Then
Begin
d=E
End


If SKIPFIRST=TRUE Then
Begin
If sdx>0 Then //Compute the coverage for the starting pixel//
Begin // THE BOLD CODE MAY BE EXECUTED ON THE HOST IF SKIPFIRST=TRUE//
// remember that for y-major lines y1 and x1 have been swapped//
Coverage_T=(y1_fr-1+c1/2)(1-0)+.5*e1(1-0)**2 //consider x1_fr=0//
Write_Pixel(x+incrx2*~x_major,y+incry2*x_major,Coverage_T)
Coverage_S=(1-0)*c1-Coverage_T //S is below the line and has the larger coverage//
Write_Pixel(x,y,Coverage_S)
End Else //sdx<0//
Begin
Coverage_T=(y1_fr+c1/2)(1-0)+.5*e1(1-0)**2 //T has the larger coverage//
Write_Pixel(x,y,Coverage_T)
Coverage_S=(1-0)*c1-Coverage_T //S is below the line and has the larger coverage//
Write_Pixel(x-incrx2*~x_major,y-incry2*x_major,Coverage_S)
End
End Else
Begin
If sdx>0 Then //Compute the coverage for the starting pixel//
Begin // THIS CODE EXECUTED BY REX3 BECAUSE SKIPFIRST=FALSE//
Coverage_T=aa_table(s_frac) //T has the smaller coverage//
Write_Pixel(x+incrx2*~x_major,y+incry2*x_major,Coverage_T)
Coverage_S=aa_table(~s_frac) //S is below the line and has the larger coverage//
Write_Pixel(x,y,Coverage_S)
End Else // sdx<0 //
Begin
Coverage_T=aa_table(~s_frac) //T has the larger coverage//
Write_Pixel(x,y,Coverage_T)
Coverage_S=aa_table(s_frac) //S is below the line and has the larger coverage//
Write_Pixel(x-incrx2*~x_major,y-incry2*x_major,Coverage_S)
End
End // SKIPFIRST //
For i=1 to Loop-1 Do
Begin
If d<0 Then // s<t , execute a horizontal step//
Begin
x=x+incrx1 //advance to next pixel//
y=y+incry1
d=d+incr1 //compute new values for d and s//
s=s+e1
sdx=sdx+2dy=sdx+incr1 // s*2dx=s*2dx+e1*2dx i.e. sdx=sdx+2dy //
End Else // d>0 results into s>t>0, execute a 45 degree step//
Begin //45 degree move//
x=x+incrx2
y=y+incry2
d=d+incr2 //compute new values for d and s//
s=s+e2 // this brings s back into the interval [-1,1] //
sdx=sdx+2(dy-dx)=sdx+incr2 // s*2dx=s*2dx+e2*2dx=s*2dx+(dy/dx-1)*2dx i.e. sdx=sdx+2(dy-dx) //
End
If sdx>0 Then
Begin
Coverage_T=aa_table(s_frac) // s_frac=Fraction(s) //
Write_Pixel(x+incrx2*~x_major,y+incry2*x_major,Coverage_T)
Coverage_S=aa_table(~s_frac) //~s_frac=0.f-s_frac //
Write_Pixel(x,y,Coverage_S)
End Else
Begin
Coverage_T=aa_table(~s_frac)
Write_Pixel(x,y,Coverage_T)
Coverage_S=aa_table(s_frac)
Write_Pixel(x-incrx2*~x_major,y-incry2*x_major,Coverage_S)
End
End
If SKIPLAST=TRUE Then //THIS CODE MAY BE EXECUTED ON THE HOST IF SKIPLAST=TRUE//
Begin
If sdx>0 Then //Compute the coverage for the ending pixel//
Begin //Correct the endpoint(s) if start point <> end point//
Coverage_S=(1-c1/2-y2_fr)*1+.5*e1*1**2 //consider x2_fr=1//
Write_Pixel(x,y,Coverage_S)
Coverage_T=c1*1-Coverage_S //T is above the line and has the smaller coverage//
Write_Pixel(x+incrx2*~x_major,y+incry2*x_major,Coverage_T)
End Else
Begin
Coverage_S=(c1/2-y2_fr)*1+.5*e1*1**2 //S is below and has the smaller coverage//
Write_Pixel(x-incrx2*~x_major,y-incry2*x_major,Coverage_S)
Coverage_T=c1*1-Coverage_S //T is above the line and has the larger coverage//
Write_Pixel(x,y,Coverage_T)
End
End Else
Begin //For SKIPLAST=FALSE REX3 fills the last pixel //
If d<0 Then // s<t , execute a horizontal step//
Begin
x=x+incrx1 //advance to next pixel//
y=y+incry1
s=s+e1
sdx=sdx+2dy=sdx+incr1
End Else // d>0 results into s>t>0, execute a 45 degree step//
Begin //45 degree move//
x=x+incrx2
y=y+incry2
s=s+e2 // this brings s back into the interval [-1,1] //
sdx=sdx+2(dy-dx)=sdx+incr2
End
If sdx>0 Then
Begin
Coverage_T=aa_table(s_frac)
Write_Pixel(x+incrx2*~x_major,y+incry2*x_major,Coverage_T)
Coverage_S=aa_table(~s_frac)
Write_Pixel(x,y,Coverage_S)
End Else
Begin
Coverage_T=aa_table(~s_frac)
Write_Pixel(x,y,Coverage_T)
Coverage_S=aa_table(s_frac)
Write_Pixel(x-incrx2*~x_major,y-incry2*x_major,Coverage_S)
End
End
End //SKIPLAST//
End
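
The incremental updates of s and sdx in GL_AA_Bresenham keep the relation sdx=s*2dx true at every step, because s+e1 corresponds to sdx+2dy and s+e2 to sdx+2(dy-dx). The Python sketch below checks this invariant for an octant-1 line using exact rationals. It is an illustration only: the function name is hypothetical and the start-point corrections (the s<0 and E>0 tests) are left out.

```python
from fractions import Fraction


def iterate_s(dx, dy, d, s, steps):
    """Run the per-pixel updates of d, s and sdx; return the (s, sdx) pair after each step."""
    e1 = Fraction(dy, dx)
    e2 = e1 - 1
    incr1, incr2 = 2 * dy, 2 * (dy - dx)
    sdx = 2 * dx * s
    history = []
    for _ in range(steps):
        if d < 0:      # step along the major axis only
            d, s, sdx = d + incr1, s + e1, sdx + incr1
        else:          # 45 degree step
            d, s, sdx = d + incr2, s + e2, sdx + incr2
        history.append((s, sdx))
    return history


for s, sdx in iterate_s(8, 3, Fraction(7, 2), Fraction(11, 32), 6):
    print(s, sdx, sdx == 2 * 8 * s)
```

Since the sign of sdx selects between the two coverage cases, keeping sdx exact avoids a wrong T/S assignment caused by rounding in s.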
Code for antialiased line with fractional endpoints and angle compensation, with endpoint filtering


Procedure Write_Pixel(x,y,alpha) // 0=< alpha<=1 due to looking it up in aa_table //
global variable : new_color //new_color is the current drawing color//
Begin
Read_Framebuffer(x,y,bckg_color) //read the background color at location (x,y)//
color=alpha*new_color + (1-alpha)*bckg_color // alpha represents pixel coverage//
Write_Framebuffer(x,y,color) //write back the result of blending to location (x,y)//
End


Procedure GL_AAE_Bresenham(x1,y1,x2,y2,e1,c1)
fixed : x1,y1,x2,y2,e1 //CPU computes the octant//
array : aa_table0(s,e1) ,aa_table1(1-s,e1) // This array is a function of slope and is indexed with s //
integer : Octant,x10,y10,x20,y20
integer : x_major //x_major=1 in octants 1,4,5,8 //
//e1=dy/dx for x-major . e1=dx/dy for y-major where dx=ABS(x1-x2) and dy=ABS(y1-y2)//
//Compute the octant-independent values//
e2=e1-1.0
x10=int(x1) , y10=int(y1) //REX3 computes the fixed->int and the d term//
x20=int(x2) , y20=int(y2)
x=x10, y=y10
dx=ABS(x1-x2), dy=ABS(y1-y2)
dx_i=ABS(x10-x20)-1, dy_i=ABS(y10-y20)-1
Case Octant of (x2-x1,y2-y1,dx-dy) :


1: d=3dy-2dx , incr1=2dy , incr2=2(dy-dx), Loop=dx_i //compute the octant-dependent values//
incrx1=1,incrx2=1 ,incry1=0,incry2=1 ,x_major=1
2: d=3dx-2dy , incr1=2dx , incr2=2(dx-dy), Loop=dy_i //compute the octant-dependent values//
incrx1=0,incrx2=1 ,incry1=1 ,incry2=1 ,x_major=0
temp=x1_fract
x1_fract=y1_fract
y1_fract=temp
temp=x2_fract
x2_fract=y2_fract
y2_fract=temp
temp=dx










dx=dy
dy=temp
3: d=3dx-2dy , incr1=2dx , incr2=2(dx-dy), Loop=dy_i //compute the octant-dependent values//
incrx1=0,incrx2=-1,incry1=1,incry2=1 ,x_major=0
temp=1-x1_fract //use 1-x_fract left of y-axis//
x1_fract=y1_fract
y1_fract=temp
temp=1-x2_fract //use 1-x_fract left of y-axis//
x2_fract=y2_fract
y2_fract=temp
temp=dx
dx=dy
dy=temp
4: d=3dy-2dx , incr1=2dy , incr2=2(dy-dx), Loop=dx_i //compute the octant-dependent values//
incrx1=-1 ,incrx2=-1 ,incry1=0,incry2=1 ,x_major=1
x1_fract=1-x1_fract //use 1-x_fract left of y-axis//
x2_fract=1-x2_fract
5: d=3dy-2dx , incr1=2dy , incr2=2(dy-dx), Loop=dx_i //compute the octant-dependent values//
incrx1=-1 ,incrx2=-1 ,incry1=0,incry2=-1 ,x_major=1
x1_fract=1-x1_fract //use 1-x_fract left of y-axis//
y1_fract=1-y1_fract //use 1-y_fract below of x-axis//
x2_fract=1-x2_fract //use 1-x_fract left of y-axis//
y2_fract=1-y2_fract //use 1-y_fract below of x-axis//
6: d=3dx-2dy , incr1=2dx , incr2=2(dx-dy), Loop=dy_i //compute the octant-dependent values//
incrx1=0,incrx2=-1 ,incry1=-1,incry2=-1 ,x_major=0
temp=1-x1_fract
x1_fract=1-y1_fract //use 1-y_fract below of x-axis//
y1_fract=temp
temp=1-x2_fract
x2_fract=1-y2_fract //use 1-y_fract below of x-axis//
y2_fract=temp
temp=dx
dx=dy
dy=temp
7: d=3dx-2dy , incr1=2dx , incr2=2(dx-dy), Loop=dy_i //compute the octant-dependent values//
incrx1=0,incrx2=1 ,incry1=-1 ,incry2=-1 ,x_major=0
temp=1-y1_fract //use 1-y_fract below of x-axis//
y1_fract=x1_fract
x1_fract=temp
temp=1-y2_fract //use 1-y_fract below of x-axis//
y2_fract=x2_fract
x2_fract=temp
temp=dx
dx=dy
dy=temp
8: d=3dy-2dx , incr1=2dy , incr2=2(dy-dx), Loop=dx_i //compute the octant-dependent values//
incrx1=1,incrx2=1 ,incry1=0,incry2=-1 ,x_major=1
y1_fract=1-y1_fract //use 1-y_fract below of x-axis//
y2_fract=1-y2_fract


s=y1_fract-0.5+e1(0.5-x1_fract) //s for the first pixel//
sdx=2[(y1_fract-0.5)dx+(0.5-x1_fract)dy]=dy-dx+2(dx*y1_fract-dy*x1_fract)
// sdx=s*2dx is an infinitely precise number //
If s<0 Then // The correct , positive s, is in this case s=y1_fract+0.5+e1(0.5-x1_fract) = s+1 //
Begin
s=s+1
sdx=sdx+2*dx //when s=s+1 sdx=sdx+2*dx //
End
d=d+2(dx*y1_fract-dy*x1_fract) //adjust d due to fractional endpoints,this is d for second pixel//
// s=y1_fract+e1*(1.5-x1_fract)-0.5 this would have been s for second pixel , s=s+e1 //
E=d-2dx
If E>0 Then
Begin
d=E
End


If SKIPFIRST=TRUE Then
Begin
If sdx>0 Then //Compute the coverage for the starting pixel//
Begin // THE BOLD CODE MAY BE EXECUTED ON THE HOST IF SKIPFIRST=TRUE//
Coverage_T=(y1_fr-1+c1/2)(1-x1_fr)+.5*e1(1-x1_fr)**2 //T has the smaller coverage//
Write_Pixel(x+incrx2*~x_major,y+incry2*x_major,Coverage_T)
Coverage_S=(1-x1_fr)*c1-Coverage_T //S is below the line and has the larger coverage//
Write_Pixel(x,y,Coverage_S)
End Else //sdx<0 //
Begin
Coverage_T=(y1_fr+c1/2)(1-x1_fr)+.5*e1(1-x1_fr)**2 //T has the larger coverage//
Write_Pixel(x,y,Coverage_T)
Coverage_S=(1-x1_fr)*c1-Coverage_T //S is below the line and has the larger coverage//
Write_Pixel(x-incrx2*~x_major,y-incry2*x_major,Coverage_S)
End
End Else
Begin
If sdx>0 Then //Compute the coverage for the starting pixel//
Begin //THIS CODE EXECUTED BY REX3 BECAUSE SKIPFIRST=FALSE//
Coverage_T=aa_table(s_frac*(1-x1_fr)) //The coverages are inversely proportional with x1_fr//
Write_Pixel(x+incrx2*~x_major,y+incry2*x_major,Coverage_T)
Coverage_S=aa_table(~s_frac*(1-x1_fr)) //S is below the line and has the larger coverage//
Write_Pixel(x,y,Coverage_S)
End Else // sdx<0 //
Begin
Coverage_T=aa_table(~s_frac*(1-x1_fr)) //T has the larger coverage//
Write_Pixel(x,y,Coverage_T)
Coverage_S=aa_table(s_frac*(1-x1_fr)) //S is below the line and has the larger coverage//
Write_Pixel(x-incrx2*~x_major,y-incry2*x_major,Coverage_S)
End
End // SKIPFIRST //
For i=1 to Loop-1 Do
Begin
If d<0 Then // s<t , execute a horizontal step//
Begin
x=x+incrx1 //advance to next pixel//
y=y+incry1
d=d+incr1 //compute new values for d and s//
s=s+e1
sdx=sdx+2dy=sdx+incr1
End Else // d>0 results into s>t>0, execute a 45 degree step//
Begin //45 degree move//
x=x+incrx2
y=y+incry2
d=d+incr2 //compute new values for d and s//
s=s+e2 // this brings s back into the interval [-1,1] //
sdx=sdx+2(dy-dx)=sdx+incr2
End
If sdx>0 Then
Begin
Coverage_T=aa_table(s_frac) // s_frac=Fraction(s) //
Write_Pixel(x+incrx2*~x_major,y+incry2*x_major,Coverage_T)
Coverage_S=aa_table(~s_frac) //~s_frac=1-s_frac //
Write_Pixel(x,y,Coverage_S)
End Else
Begin
Coverage_T=aa_table(~s_frac)
Write_Pixel(x,y,Coverage_T)
Coverage_S=aa_table(s_frac)
Write_Pixel(x-incrx2*~x_major,y-incry2*x_major,Coverage_S)
End
End
If SKIPLAST=TRUE Then //THIS CODE MAY BE EXECUTED ON THE HOST IF SKIPLAST=TRUE//
Begin
If sdx>0 Then //Compute the coverage for the ending pixel//
Begin //Correct the endpoint(s) if start point <> end point//
Coverage_S=(1-c1/2-y2_fr)x2_fr+.5*e1*x2_fr**2 //S is below and has the larger coverage//
Write_Pixel(x,y,Coverage_S)
Coverage_T=c1*x2_fr-Coverage_S //T is above the line and has the smaller coverage//
Write_Pixel(x+incrx2*~x_major,y+incry2*x_major,Coverage_T)
End Else // sdx<0//
Begin
Coverage_S=(c1/2-y2_fr)x2_fr+.5*e1*x2_fr**2 //S is below and has the smaller coverage//
Write_Pixel(x-incrx2*~x_major,y-incry2*x_major,Coverage_S)
Coverage_T=c1*x2_fr-Coverage_S //T is above the line and has the larger coverage//
Write_Pixel(x,y,Coverage_T)
End
End Else
Begin //For SKIPLAST=FALSE REX3 fills the last pixel //
If d<0 Then // s<t , execute a horizontal step//
Begin
x=x+incrx1 //advance to next pixel//
y=y+incry1
s=s+e1
sdx=sdx+2dy=sdx+incr1
End Else // d>0 results into s>t>0, execute a 45 degree step//
Begin //45 degree move//
x=x+incrx2
y=y+incry2
s=s+e2 // this brings s back into the interval [-1,1] //
sdx=sdx+2(dy-dx)=sdx+incr2
End
If sdx>0 Then //The coverages are directly proportional with x2_fr//
Begin
Coverage_T=aa_table(s_frac*x2_fr)
Write_Pixel(x+incrx2*~x_major,y+incry2*x_major,Coverage_T)
Coverage_S=aa_table(~s_frac*x2_fr)
Write_Pixel(x,y,Coverage_S)
End Else
Begin
Coverage_T=aa_table(~s_frac*x2_fr)
Write_Pixel(x,y,Coverage_T)
Coverage_S=aa_table(s_frac*x2_fr)
Write_Pixel(x-incrx2*~x_major,y-incry2*x_major,Coverage_S)
End
End
End //SKIPLAST//
End


Code for polygon antialiasing top edge with fractional endpoints


Procedure Write_Pixel(x,y,alpha) // 0=< alpha<=1 //
Begin
global variable : new_color //new_color is the current drawing color//
Read_Framebuffer(x,y,bckg_color) //read the background color at location (x,y)//
color=alpha*new_color + (1-alpha)*bckg_color // alpha represents pixel coverage//
Write_Framebuffer(x,y,color) //write back the result of blending to location (x,y)//
End


Procedure GL_AA_Bresenham_Edge(x1,y1,x2,y2,e1)
fixed : x1,y1,x2,y2,e1 //e1=dy/dx for x-major . e1=dx/dy for y-major //
array : aa_table0(s) // for antialiasing edges we may not need angle compensation //
integer : x_major //x_major=1 in octants 1,4,5,8 //
//Compute the octant-independent values//
e2=e1-1
x10=int(x1) , y10=int(y1) //REX3 computes the fixed->int and the d term//
x20=int(x2) , y20=int(y2)
x=x1, y=y1
dx=ABS(x1-x2), dy=ABS(y1-y2)
dx_i=ABS(x10-x20)-1, dy_i=ABS(y10-y20)-1


Case Octant of (x2-x1,y2-y1,dx-dy) :


1: d=3dy-2dx , incr1=2dy , incr2=2(dy-dx), Loop=dx_i //compute the octant-dependent values//
incrx1=1 ,incrx2=1 ,incry1=0,incry2=1 ,x_major=1
2: d=3dx-2dy , incr1=2dx , incr2=2(dx-dy), Loop=dy_i //compute the octant-dependent values//
incrx1=0,incrx2=1 ,incry1=1 ,incry2=1 ,x_major=0
temp=x1_fract
x1_fract=y1_fract
y1_fract=temp
temp=dx
dx=dy
dy=temp
3: d=3dx-2dy , incr1=2dx , incr2=2(dx-dy), Loop=dy_i //compute the octant-dependent values//
incrx1=0,incrx2=-1,incry1=1,incry2=1 ,x_major=0
temp=1-x1_fract //use 1-x_fract left of y-axis//
x1_fract=y1_fract
y1_fract=temp
temp=dx
dx=dy
dy=temp
4: d=3dy-2dx , incr1=2dy , incr2=2(dy-dx), Loop=dx_i //compute the octant-dependent values//
incrx1=-1 ,incrx2=-1 ,incry1=0,incry2=1 ,x_major=1
x1_fract=1-x1_fract //use 1-x_fract left of y-axis//
5: d=3dy-2dx , incr1=2dy , incr2=2(dy-dx), Loop=dx_i //compute the octant-dependent values//
incrx1=-1 ,incrx2=-1 ,incry1=0,incry2=-1 ,x_major=1
x1_fract=1-x1_fract //use 1-x_fract left of y-axis//
y1_fract=1-y1_fract //use 1-y_fract below of x-axis//
6: d=3dx-2dy , incr1=2dx , incr2=2(dx-dy), Loop=dy_i //compute the octant-dependent values//
incrx1=0,incrx2=-1,incry1=-1,incry2=-1 ,x_major=0
temp=1-x1_fract
x1_fract=1-y1_fract //use 1-y_fract below of x-axis//
y1_fract=temp
temp=dx
dx=dy
dy=temp
7: d=3dx-2dy , incr1=2dx , incr2=2(dx-dy), Loop=dy_i //compute the octant-dependent values//
incrx1=0,incrx2=1 ,incry1=-1 ,incry2=-1 ,x_major=0
temp=1-y1_fract //use 1-y_fract below of x-axis//
y1_fract=x1_fract
x1_fract=temp
temp=dx
dx=dy
dy=temp
8: d=3dy-2dx , incr1=2dy , incr2=2(dy-dx), Loop=dx_i //compute the octant-dependent values//
incrx1=1 ,incrx2=1 ,incry1=0,incry2=-1 ,x_major=1
y1_fract=1-y1_fract //use 1-y_fract below of x-axis//


d=d+2(dx*y1_fract-dy*x1_fract) //adjust d due to fractional endpoints//
s=y1_fract-0.5+e1*(.5-x1_fract)
sdx=2[(y1_fract-0.5)dx+(.5-x1_fract)dy]=dy-dx+2(dx*y1_fract-dy*x1_fract)
If s<0 Then // The correct , positive s, is in this case s=y1_fract+0.5+e1(0.5-x1_fract) = s+1 //
Begin
s=s+1
sdx=sdx+2*dx //when s=s+1 sdx=sdx+2*dx //
End
E=d-2dx
If E>0 Then
Begin
d=E
End


If SKIPFIRST=TRUE Then
Begin
   If sdx>0 Then //Compute the coverage for the starting pixel//
   Begin // THE BOLD CODE MAY BE EXECUTED ON THE HOST IF SKIPFIRST=TRUE//
      Coverage_T=(y1_fr-1+c1/2)(1-x1_fr)+.5*e1(1-x1_fr)**2 //T has the smaller coverage//
      Write_Pixel(x+incrx2*~x_major,y+incry2*x_major,Coverage_T)
   End Else // sdx<0 //
   Begin
      Coverage_T=(y1_fr+c1/2)(1-x1_fr)+.5*e1(1-x1_fr)**2 //T has the larger coverage//
      Write_Pixel(x,y,Coverage_T)
   End
End Else
Begin
   If sdx>0 Then //Compute the coverage for the starting pixel//
   Begin // THIS CODE EXECUTED BY REX3 BECAUSE SKIPFIRST=FALSE//
      Coverage_T=aa_table(s_frac*(1-x1_fr)) //The coverages are inversely proportional with x1_fr//
      Write_Pixel(x+incrx2*~x_major,y+incry2*x_major,Coverage_T)
   End Else // sdx<0 //
   Begin
      Coverage_T=aa_table(~s_frac*(1-x1_fr)) //T has the larger coverage//
      Write_Pixel(x,y,Coverage_T)
   End
End // SKIPFIRST //
For i=1 to Loop-1 Do
Begin
   If d<0 Then // s<t , execute a horizontal step//
   Begin
      x=x+incrx1 //advance to next pixel//
      y=y+incry1
      d=d+incr1 //compute new values for d and s//
      s=s+e1
      sdx=sdx+2dy=sdx+incr1
   End Else //s>t>0, execute a 45 degree step//
   Begin //45 degree move//
      x=x+incrx2
      y=y+incry2
      d=d+incr2
      s=s+e2
      sdx=sdx+2(dy-dx)=sdx+incr2
   End
   If sdx>0 Then
   Begin
      Coverage_T=aa_table(s_frac) // only the top pixel is antialiased //
      Write_Pixel(x+incrx2*~x_major,y+incry2*x_major,Coverage_T)
   End Else
   Begin
      Coverage_T=aa_table(~s_frac) //only the top pixel is antialiased //
      Write_Pixel(x,y,Coverage_T)
   End
   End // If//
End // For //
If SKIPLAST=TRUE Then //THIS CODE MAY BE EXECUTED ON THE HOST IF SKIPLAST=TRUE//
Begin
   If sdx>0 Then //Compute the coverage for the ending pixel//
   Begin //Correct the endpoint(s) if start point <> end point//
      Coverage_T=c1*x2_fr-Coverage_S //T is above the line and has the smaller coverage//
      Write_Pixel(x+incrx2*~x_major,y+incry2*x_major,Coverage_T)
   End Else // sdx<0 //
   Begin
      Coverage_T=c1*x2_fr-Coverage_S //T is above the line and has the larger coverage//
      Write_Pixel(x,y,Coverage_T)
   End










End Else
Begin //For SKIPLAST=FALSE REX3 fills the last pixel //
   If d<0 Then // s<t , execute a horizontal step//
   Begin
      x=x+incrx1 //advance to next pixel//
      y=y+incry1
      s=s+e1
      sdx=sdx+2dy=sdx+incr1
   End Else // d>0 results into s>t>0, execute a 45 degree step//
   Begin //45 degree move//
      x=x+incrx2
      y=y+incry2
      s=s+e2 // this brings s back into the interval [-1,1] //
      sdx=sdx+2(dy-dx)=sdx+incr2
   End
   If sdx>0 Then //The coverages are directly proportional with x2_fr//
   Begin
      Coverage_T=aa_table(s_frac*x2_fr)
      Write_Pixel(x+incrx2*~x_major,y+incry2*x_major,Coverage_T)
   End Else
   Begin
      Coverage_T=aa_table(~s_frac*x2_fr)
      Write_Pixel(x,y,Coverage_T)
   End
End
End //SKIPLAST//
End


Code for polygon antialiasing bottom edge with fractional endpoints


Procedure Write_Pixel(x,y,alpha) // 0<=alpha<=1 //

Begin

   global variable : new_color //new_color is the current drawing color//

   Read_Framebuffer(x,y,bckg_color) //read the background color at location (x,y)//
   color=alpha*new_color + (1-alpha)*bckg_color // alpha represents pixel coverage//
   Write_Framebuffer(x,y,color) //write back the result of blending to location (x,y)//
End
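
The read-modify-write blend performed by Write_Pixel can be sketched in Python as follows. This is only an illustration under stated assumptions: the dictionary `framebuffer`, the tuple color representation, the default background, and the rounding to integers are all hypothetical and are not taken from the REX3 hardware.

```python
# Hypothetical stand-in for the framebuffer: maps (x, y) -> (r, g, b).
framebuffer = {}


def blend_pixel(new_color, bckg_color, alpha):
    """Return alpha*new_color + (1-alpha)*bckg_color, per component."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha (pixel coverage) must lie in [0, 1]")
    return tuple(
        int(round(alpha * n + (1.0 - alpha) * b))  # integer rounding is an assumption
        for n, b in zip(new_color, bckg_color)
    )


def write_pixel(x, y, alpha, new_color, default=(0, 0, 0)):
    """Read the background at (x, y), blend by coverage, write the result back."""
    bckg_color = framebuffer.get((x, y), default)
    framebuffer[(x, y)] = blend_pixel(new_color, bckg_color, alpha)
```

A coverage of 1 replaces the background entirely, a coverage of 0 leaves it unchanged, and intermediate values mix the two linearly.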


Procedure GL_AA_Bresenham_Edge(x1,y1,x2,y2,e1)

fixed : x1,y1,x2,y2,e1 //e1=dy/dx for x-major . e1=dx/dy for y-major //
array : aa_table0(s) // for antialiasing edges we may not need angle compensation //
integer : x_major //x_major=1 in octants 1,4,5,8 //

//Compute the octant-independent values//

e2=e1-1

x10=int(x1) , y10=int(y1) //REX3 computes the fixed->int and the d term//
x20=int(x2) , y20=int(y2)

x=x1, y=y1

dx=ABS(x1-x2), dy=ABS(y1-y2) 

dx_i=ABS(x10-x20)-1, dy_i=ABS(y10-y20)-1 


Case Octant of (x2-x1,y2-y1,dx-dy):

1: d=3dy-2dx , incr1=2dy , incr2=2(dy-dx), Loop=dx_i //compute the octant-dependent values//
   incrx1=1 ,incrx2=1 ,incry1=0 ,incry2=1 ,x_major=1
2: d=3dx-2dy , incr1=2dx , incr2=2(dx-dy), Loop=dy_i //compute the octant-dependent values//
   incrx1=0 ,incrx2=1 ,incry1=1 ,incry2=1 ,x_major=0
   temp=x1_fract
   x1_fract=y1_fract
   y1_fract=temp
   temp=dx
   dx=dy
   dy=temp
3: d=3dx-2dy , incr1=2dx , incr2=2(dx-dy), Loop=dy_i //compute the octant-dependent values//
   incrx1=0 ,incrx2=-1 ,incry1=1 ,incry2=1 ,x_major=0
   temp=x1_fract //use 1-x_fract left of y-axis//
   x1_fract=y1_fract
   y1_fract=temp
   temp=dx
   dx=dy
   dy=temp
4: d=3dy-2dx , incr1=2dy , incr2=2(dy-dx), Loop=dx_i //compute the octant-dependent values//
   incrx1=-1 ,incrx2=-1 ,incry1=0 ,incry2=1 ,x_major=1
   x1_fract=1-x1_fract //use 1-x_fract left of y-axis//
5: d=2dy-dx , incr1=2dy , incr2=2(dy-dx), Loop=dx_i //compute the octant-dependent values//
   incrx1=-1 ,incrx2=-1 ,incry1=0 ,incry2=-1
   x1_fract=1-x1_fract //use 1-x_fract left of y-axis//
   y1_fract=1-y1_fract //use 1-y_fract below of x-axis//
6: d=3dx-2dy , incr1=2dx , incr2=2(dx-dy), Loop=dy_i //compute the octant-dependent values//
   incrx1=0 ,incrx2=-1 ,incry1=-1 ,incry2=-1 ,x_major=0
   temp=1-x1_fract
   x1_fract=1-y1_fract //use 1-y_fract below of x-axis//
   y1_fract=temp
   temp=dx
   dx=dy
   dy=temp
7: d=3dx-2dy , incr1=2dx , incr2=2(dx-dy), Loop=dy_i //compute the octant-dependent values//
   incrx1=0 ,incrx2=1 ,incry1=-1 ,incry2=-1 ,x_major=0
   temp=1-y1_fract //use 1-y_fract below of x-axis//
   y1_fract=x1_fract
   x1_fract=temp
   temp=dx
   dx=dy
   dy=temp
8: d=3dy-2dx , incr1=2dy , incr2=2(dy-dx), Loop=dx_i //compute the octant-dependent values//
   incrx1=1 ,incrx2=1 ,incry1=0 ,incry2=-1 ,x_major=1
   y1_fract=1-y1_fract //use 1-y_fract below of x-axis//


d=d+2(dx*y1_fract-dy*x1_fract) //adjust d due to fractional endpoints//
s=y1_fract-0.5+e1*(.5-x1_fract)
sdx=2[(y1_fract-0.5)dx+(.5-x1_fract)dy]=dy-dx+2(dx*y1_fract-dy*x1_fract)
If s<0 Then // The correct, positive s is in this case s=y1_frac+0.5+e1(0.5-x1_frac) = s+1 //
Begin
   s=s+1
   sdx=sdx+2*dx //when s=s+1 sdx=sdx+2*dx //
End
E=d-2dx
If E>0 Then
Begin
   d=E
End


If SKIPFIRST=TRUE Then
Begin
   If sdx>0 Then //Compute the coverage for the starting pixel//
   Begin // THE BOLD CODE MAY BE EXECUTED ON THE HOST IF SKIPFIRST=TRUE//
      Coverage_S=(1-x1_fr)*c1-Coverage_T //S is below the line and has the larger coverage//
      Write_Pixel(x,y,Coverage_S)
   End Else // sdx<0 //
   Begin
      Coverage_S=(1-x1_fr)*c1-Coverage_T //S is below the line and has the larger coverage//
      Write_Pixel(x-incrx2*~x_major,y-incry2*x_major,Coverage_S)
   End
End Else
Begin
   If sdx>0 Then //Compute the coverage for the starting pixel//
   Begin // THIS CODE EXECUTED BY REX3 BECAUSE SKIPFIRST=FALSE//
      Coverage_S=aa_table(~s_frac*(1-x1_fr)) //S is below the line and has the larger coverage//
      Write_Pixel(x,y,Coverage_S)
   End Else // sdx<0 //
   Begin
      Coverage_S=aa_table(s_frac*(1-x1_fr)) //S is below the line and has the larger coverage//
      Write_Pixel(x-incrx2*~x_major,y-incry2*x_major,Coverage_S)
   End
End // SKIPFIRST //
For i=1 to Loop-1 Do
Begin
   If d<0 Then // s<t , execute a horizontal step//
   Begin
      x=x+incrx1 //advance to next pixel//
      y=y+incry1
      d=d+incr1 //compute new values for d and s//
      s=s+e1
      sdx=sdx+2dy=sdx+incr1
   End Else //s>t>0, execute a 45 degree step//
   Begin //45 degree move//
      x=x+incrx2
      y=y+incry2
      d=d+incr2
      s=s+e2
      sdx=sdx+2(dy-dx)=sdx+incr2
   End
   If sdx>0 Then
   Begin
      Coverage_S=aa_table(~s_frac) // only the bottom pixel is antialiased //
      Write_Pixel(x,y,Coverage_S)
   End Else
   Begin
      Coverage_S=aa_table(~s_frac) // only the bottom pixel is antialiased //
      Write_Pixel(x-incrx2*~x_major,y-incry2*x_major,Coverage_S)
   End
   End // If//
End // For //
If SKIPLAST=TRUE Then //THIS CODE MAY BE EXECUTED ON THE HOST IF SKIPLAST=TRUE//
Begin
   If sdx>0 Then //Compute the coverage for the ending pixel//
   Begin //Correct the endpoint(s) if start point <> end point//
      Coverage_S=(1-c1/2-y2_fr)x2_fr+.5*e1*x2_fr**2 //S is below and has the larger coverage//
      Write_Pixel(x,y,Coverage_S)
   End Else // sdx<0 //
   Begin
      Coverage_S=(c1/2-y2_fr)x2_fr+.5*e1*x2_fr**2 //S is below and has the smaller coverage//
      Write_Pixel(x-incrx2*~x_major,y-incry2*x_major,Coverage_S)
   End
End Else
Begin //For SKIPLAST=FALSE REX3 fills the last pixel //
   If d<0 Then // s<t , execute a horizontal step//
   Begin
      x=x+incrx1 //advance to next pixel//
      y=y+incry1
      s=s+e1
      sdx=sdx+2dy=sdx+incr1
   End Else // d>0 results into s>t>0, execute a 45 degree step//
   Begin //45 degree move//
      x=x+incrx2
      y=y+incry2
      s=s+e2 // this brings s back into the interval [-1,1] //
      sdx=sdx+2(dy-dx)=sdx+incr2
   End
   If sdx>0 Then //The coverages are directly proportional with x2_fr//
   Begin
      Coverage_S=aa_table(~s_frac*x2_fr)
      Write_Pixel(x,y,Coverage_S)
   End Else
   Begin
      Coverage_S=aa_table(s_frac*x2_fr)
      Write_Pixel(x-incrx2*~x_major,y-incry2*x_major,Coverage_S)
   End
End
End //SKIPLAST//
End


3.7 Double Buffering


Double-buffered drawing is supported for pixels. Allowed formats are described in Section 3.9, Framebuffer
Formats.


Double-buffering for writes is specified by the pixel depth and format in DRAWMODE1, and implicitly via the
WRITEMASK, which must be set to match the table in Section 3.9. Writes to both buffers use replicated
source data.


Double-buffered reads are explicitly specified by the DRAWMODE1 bit DBLSRC. Buffer0 (or BufferA) is the
less significant pixel within the framebuffer data value: see Section 3.9 for details. Pixel format is again
specified as above, via DRAWMODE1. This handles cases of R-M-W drawing, and host/DMA reads of a
double-buffered framebuffer.


Double buffering brings about a peculiarity with the LOGICOP function: while LO_DST can normally be
viewed as a NOOP (the write result is simply the original destination value), the case of double buffer source
not equal to double buffer destination must actually perform a copy from one buffer to the other. Therefore
the REX3 hardware treats LO_DST as a copy, not a NOOP.


3.8 Framebuffer Data Values 


Framebuffer data includes pixel, overlay, and CID types; one is specified for each read or write operation,
using the PLANES field of the DRAWMODE1 register.


There are two main sources for drawn data: the DDA, and the host data register, HOSTRW1,0. The data source
is specified by the DRAWMODE0 register bits COLORHOST and ALPHAHOST. For host data, COLORHOST,
ALPHAHOST=1 and the data is interpreted using DRAWMODE1 as specified by fields RWPACKED, RWDOUBLE,
HOSTDEPTH. The data is assumed to be within legal range; no clamping is necessary. COLORHOST,
ALPHAHOST=0 directs the graphics pipeline to make use of the DDA values; in this case, SHADE=1 specifies
that linear shading is performed for successive, iterated values. The bit RGBMODE specifies whether color
index or RGB values are to be calculated. DDA values of R,G,B,A are clamped each iteration before being
sent down the pipeline. As each of these components has an additional overflow bit at the DDA, a normalized
range of [-.5 to +1.5) is handled prior to clamping. Color index DDA values can be clamped to the desired
range by setting the DRAWMODE0 bit ENCICLAMP.


Normally either the DDA or the host value is used, but there is an exception for the blend function, where
both are taken: ALPHAHOST=1 with COLORHOST=0 specifies that the HOSTRW1,0 alpha fields are to be used to
blend the DDA R,G,B components. For more information on the Blend Functions, see Section 3.8.5.


The framebuffer pixel depth to be drawn is specified by the DRAWMODE1 field DRAWDEPTH. In conjunction
with the rest of the modes mentioned, framebuffer format can be controlled as shown in Section 3.9.


Other options or modes which affect pixel value include dither, round, antialias, blend, pattern, and logicop. 
These are covered in the following sections. 


3.8.1 Patterning and Stippling 


There are two 32b pattern registers in REX3: LSPATTERN and ZPATTERN. They are enabled via DRAWMODE0
bits ENLSPATTERN, ENZPATTERN. This determines whether each is used in the pixel path, and
whether the pattern iterates during drawing. Each of these patterns can be specified as transparent (mask
out pixels corresponding to pattern=0), or opaque (substitute a background color for pixels corresponding
to pattern=0), via bits LSOPAQUE, ZOPAQUE. Opaque patterning relies on the background color stored in the
COLORBACK register.


The LSPATTERN is used mainly for lines by the GL, or more generally by X11. The LSMODE register contains
a length specifier LSLENGTH (17-32) for pattern recirculation, and a repeat per bit specifier (1-255)
LSREPEAT to describe iterations of each pattern bit. Context switching is aided by the LSRCOUNT field,
which contains the iteration state of the LSREPEAT counter. The LSREPEAT function is for linedraw only, and
must be set to '1' by host otherwise. Similarly, the LSADVLAST function is cleared for connected vectors
case only, and must be set by host otherwise.


Wide lines require the line stipple pattern to be reset identically for each wide line segment; this is
accomplished via the state in registers LSPATSAVE and LSMODE field LSRCNTSAVE. At the start of drawing a
wide line, these registers are initialized to the same values as LSPATTERN, LSRCOUNT respectively. For
all but the first line of a wide line segment, the saved versions are copied into the working registers, using
command with GO "LSRESTORE". Upon completion of the last line of a wide line segment, command with
GO "LSSAVE" is issued to copy iterated state into the saved registers.


The ZPATTERN is used for patterning and as a Z write enable mask (soft Z). It is always 32b long and
repeats.

When both patterns are enabled, the background color is substituted into the pixel path iff not both pattern
bits are asserted (i.e., LSPATTERN & ZPATTERN bitwise false). The pixel location can be written iff
((LSOPAQUE+LSPATTERN) & (ZOPAQUE+ZPATTERN)) is bitwise true.
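
The combined-pattern rule can be sketched per pixel as below. This is a minimal illustration, assuming that "+" in the expression above denotes logical OR and that both patterns are enabled; the function name `pattern_pixel` and the use of `None` to mean "pixel not written" are hypothetical conventions, not part of the REX3 interface.

```python
def pattern_pixel(ls_bit, z_bit, ls_opaque, z_opaque, color, color_back):
    """Return the color to write for one pixel, or None if the pixel is masked.

    ls_bit, z_bit       : current LSPATTERN / ZPATTERN bits (0 or 1)
    ls_opaque, z_opaque : LSOPAQUE / ZOPAQUE mode bits (0 or 1)
    color               : the iterated (foreground) color
    color_back          : the COLORBACK background color
    """
    # Written iff ((LSOPAQUE + LSPATTERN) & (ZOPAQUE + ZPATTERN)) is true,
    # reading "+" as OR (an assumption).
    write_enable = (ls_opaque | ls_bit) & (z_opaque | z_bit)
    if not write_enable:
        return None
    # Background is substituted unless both pattern bits are asserted.
    return color if (ls_bit & z_bit) else color_back
```

For example, with both patterns transparent a zero in either pattern masks the pixel, while with both opaque the pixel is always written and only the choice between foreground and background depends on the pattern bits.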


Z buffering of antialiased lines makes use of both patterns, with ZPATTERN used for the primary pixel mask,
and the LSPATTERN used for the secondary pixel mask.


3.8.2 Dither


REX3 uses the 4 x 4 Bayer dither matrix. The seventeen intensities created by this dither matrix are
illustrated below.

Figure: 4 x 4 Bayer Dither Matrix.


The least significant two bits of the window X and Y addresses are used to select a value from the dither
matrix. The matrix value is then compared against the 4 msbs of the target color fraction. The pixel at X, Y
is intensified if the desired value is greater than the matrix value, otherwise it is not intensified.


Because this operation would create an overall brightening of the image (and clamping at the high end), the
pre-dithered pixel values are scaled prior to matrix comparison.


Dithering is enabled by setting the DRAWMODE1 register DITHER bit.


3.8.2.1 RGB Dithering


If enabled, REX3 dithers 1, 2, 3, and 4-bit stored RGB pixel components. No dithering is performed on
24-bit RGB. The following illustrates REX3 scaled dithering for 1 through 4-bit RGB components, given an
8-bit target pixel value, P[7:0]:


1-bit (1-2-1):
Scale P[7:0] by 128/255 (≈1/2):

1. S = P[7:3] x 1/2 = P[7:3] - P[7:4]
2. if (S[3:0] > DitherMatrix[x,y]) then D = S[4] + 1
   else D = S[4]


2-bit (1-2-1 and 3-3-2):
Scale P[7:0] by 192/255 (≈3/4):

1. S = P[7:2] - P[7:2]/4 = P[7:2] - P[7:4]
2. if (S[3:0] > DitherMatrix[x,y]) then D = S[5:4] + 1
   else D = S[5:4]


3-bit (3-3-2):
Scale P[7:0] by 224/255 (≈7/8):

1. S = P[7:1] - P[7:4]
2. if (S[3:0] > DitherMatrix[x,y]) then D = S[6:4] + 1
   else D = S[6:4]


4-bit (4-4-4):
Scale P[7:0] by 240/255 (≈15/16):

1. S = P[7:0] - P[7:4]
2. if (S[3:0] > DitherMatrix[x,y]) then D = S[7:4] + 1
   else D = S[7:4]
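

The four cases above share one pattern: take the top (n+4) bits of P, subtract P[7:4], then compare the low four bits against the matrix entry. The following sketch illustrates that pattern. The matrix values in `BAYER_4X4` are the commonly published 4 x 4 Bayer ordering and the `[y][x]` indexing is a guess; neither the actual REX3 matrix contents nor its indexing are given in this section, so both are assumptions. The function name is hypothetical.

```python
# Commonly published 4x4 Bayer ordering (assumed; the REX3 matrix is not listed here).
BAYER_4X4 = (
    (0, 8, 2, 10),
    (12, 4, 14, 6),
    (3, 11, 1, 9),
    (15, 7, 13, 5),
)


def dither_rgb_component(p, bits, x, y):
    """Scale and dither an 8-bit component p down to `bits` bits (1..4)."""
    if bits not in (1, 2, 3, 4):
        raise ValueError("only 1- to 4-bit components are dithered")
    if not 0 <= p <= 255:
        raise ValueError("p must be an 8-bit value")
    s = (p >> (4 - bits)) - (p >> 4)      # S = P[7:4-bits] - P[7:4]
    fraction = s & 0xF                    # S[3:0]
    integer = s >> 4                      # S[bits+3:4]
    threshold = BAYER_4X4[y & 3][x & 3]   # selected by the two lsbs of X and Y
    return integer + 1 if fraction > threshold else integer
```

Because of the scaling step, the integer part reaches its maximum only when the fraction is zero, so the increment cannot overflow the stored component.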

3.8.2.2 Color Index Dithering 


No scaling is performed for CI pixels. In REX3, the CI fraction is clamped before the dither stage so that
no overflow will occur due to the dither increment operation.


For antialiased CI, the integer 4 lsbs are replaced by a 4-bit AWEIGHT (intensity). REX3 then dithers by
incrementing CI(4). The DDA section muxes the original integer 4 lsbs to the CI fraction, so that dithering
logic always uses the same 4-bit field for matrix comparison. (Dithering has no effect on 4-bit antialiased
pixels).


The following illustrates CI dithering given a 12-bit integer and 4-bit fraction, I[11:0].F[3:0]:

CI 4, 8 and 12-bit, non-antialiased:

if (F[3:0] > DitherMatrix[x,y]) then D = I[11:0] + 1
else D = I[3:0]

CI 8 and 12-bit, antialias enabled:

if (F[3:0] > DitherMatrix[x,y]) then D = I[11:0] + 0x10
else D = I[7:0]


3.8.3 Color rounding


GL requires the color be rounded to the nearest color. The dithering algorithm takes care of color rounding 
when dithering is enabled. When dithering is turned off, the intensity P of the color is rounded to the nearest 
color according to the algorithm described as follows. 


Non-antialiased Color index: Increment the color if the MSB of the color fraction is 1. 
Antialiased Color index : Increment the color by 16 if bit 3 of the iterated color integer is 1. 
Antialiased 4 bit color index is not rounded. 


RGB 1 bit : The final color D[0] = P[7], the MSB of the color.


RGB 2 bits : S[5:0] = P[7:2] - P[7:2]/4 = P[7:2] - P[7:4] 
The final color D[1:0] = S[5:4] + S[3] 


RGB 3 bits : S[6:0] = P[7:1] - P[7:1]/8 = P[7:1] - P[7:4] 
The final color D[2:0] = S[6:4] + S[3] 


RGB 4 bits : S[7:0] = P[7:0] - P[7:0]/16 = P[7:0] - P[7:4] 
The final color D[3:0] = S[7:4] + S[3] 


RGB 8 bits : The final color D[7:0] = P[7:0]; no rounding is performed.
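

The RGB rounding rules listed above can be sketched as follows. This is an illustration of the stated formulas only; the function name `round_rgb_component` is hypothetical and the color index cases are not covered.

```python
def round_rgb_component(p, bits):
    """Round an 8-bit component p to `bits` bits when dithering is off."""
    if not 0 <= p <= 255:
        raise ValueError("p must be an 8-bit value")
    if bits == 1:
        return p >> 7                      # D[0] = P[7]
    if bits == 8:
        return p                           # no rounding is performed
    if bits not in (2, 3, 4):
        raise ValueError("unsupported component width")
    s = (p >> (4 - bits)) - (p >> 4)       # same scaling as in the dither stage
    return (s >> 4) + ((s >> 3) & 1)       # integer part plus S[3] (round half up)
```

As with dithering, the scaling keeps the result within range: the largest scaled value has a zero fraction, so adding S[3] never carries out of the component.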


The rounding of color is performed in the dithering block, which is before the logicop block, therefore the 
source color of the logicop is rounded but the destination color and the logicop result are not rounded. 


3.8.4 Logic OP 


The LOGICOP field of the DRAWMODE1 register defines the logical operation used to combine the pixels
being iterated (source pixels) with the pixels already written (destination pixels). Logical operations can be
performed on any planes. Logical operations are disabled when LOGICOP=3. The logical operation is
implemented in the RB2 chip.


3.8.5 Blend 


In RGB mode, the system draws pixels using a function that blends the incoming (source) RGBA values
with the RGBA values that are already in the frame buffer (destination) or the background color register
COLORBACK (if BACKBLEND in the DRAWMODE1 register is set to 1). The SFACTOR and DFACTOR fields
of the DRAWMODE1 register define the source color multiplier (Fs) and destination color multiplier (Fd)
used for blending. The blending function is: Cb = Cs*Fs + Cd*Fd, where Cb is the blended color, Cs is the
source color and Cd is the destination color. The alpha and color components in the source and destination
multipliers are normalized from 8-bit integers to numbers between 0 and 1 by adding the MSB to the
number and dividing by 256. Thus FF becomes 1.0 and 0 remains 0.


SFACTOR | Source Multiplier (Fs)
--------+---------------------------------------
0       | zero
1       | one
2       | normalized destination color
3       | one minus normalized destination color
4       | normalized source alpha
5       | one minus normalized source alpha

Table 20: SFACTOR Definition


DFACTOR | Destination Multiplier (Fd)
--------+---------------------------------------
0       | zero
1       | one
2       | normalized source color
3       | one minus normalized source color
4       | normalized source alpha
5       | one minus normalized source alpha

Table 21: DFACTOR Definition


When the source multiplier is set to source alpha (SFACTOR=4), the alpha component can be blended in two
different ways depending on how the BLENDALPHA bit in the DRAWMODE1 register is set. When BLENDALPHA
is set to 0, the source multiplier for blending alpha is one instead of source alpha, and the destination
multiplier is defined by DFACTOR. When BLENDALPHA is set to 1, alpha is blended the way defined by
SFACTOR and DFACTOR.
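

The blend function and the normalization rule can be illustrated with the sketch below. Several details are assumptions and are marked as such: the numbering of the multiplier codes as 0 through 5 (only SFACTOR=4 = source alpha is stated explicitly in the text), the truncation to an integer, and the clamp to 255. The function names are hypothetical.

```python
def normalize(v):
    """Map an 8-bit integer to [0, 1]: add the MSB, then divide by 256."""
    return (v + (v >> 7)) / 256.0


def multiplier(code, other_color, src_alpha):
    """Return Fs or Fd for a multiplier code (0..5 numbering is assumed).

    other_color is the destination color when computing Fs and the
    source color when computing Fd.
    """
    if code == 0:
        return 0.0
    if code == 1:
        return 1.0
    if code == 2:
        return normalize(other_color)
    if code == 3:
        return 1.0 - normalize(other_color)
    if code == 4:
        return normalize(src_alpha)
    if code == 5:
        return 1.0 - normalize(src_alpha)
    raise ValueError("unknown multiplier code")


def blend_component(cs, cd, src_alpha, sfactor, dfactor):
    """Cb = Cs*Fs + Cd*Fd for one 8-bit color component."""
    fs = multiplier(sfactor, cd, src_alpha)
    fd = multiplier(dfactor, cs, src_alpha)
    cb = cs * fs + cd * fd
    return min(255, int(cb))  # truncation and clamping are assumptions
```

With SFACTOR=4 and DFACTOR=5 this reduces to the familiar coverage blend: a source alpha of FF yields the source color, and a source alpha of 0 yields the destination color.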


Blending is enabled by setting BLEND in the DRAWMODE1 register to 1. Enabling the blender slows down
pixel processing, therefore the blender should not be enabled if it is not used. Blending and logical
operations are mutually exclusive.


3.9 Framebuffer Formats


Table 22: Frame Buffer Pixel Formats

[The per-bit assignment grid (framebuffer data bits D23 to D0 for each format) is not recoverable from
the source. The formats listed in the table are:]

PLANES | PIXEL TYPE
-------+--------------------------------------------
24     | RGB-SB, 24 bit
24     | RGB-DB
24     | CI-SB, 12 bit
24     | CI-DB, 12+12
8/24   | RGB-SB, 8 bit (3-3-2)
8/24   | RGB-DB, 8 bit
8/24   | CI-SB, 8 bit
8/24   | CI-DB, 4+4
24     | RGBa-DB, 3324 + 3324
24     | RGBa-SB, 4448
24     | CID/AUX: 2 bits CID, 2 bits PUP, 8 bit AUX
8      | 2 bits CID, 2 bits PUP

NOTES: R - Red, G - Green, B - Blue, I(n) - Color index, C(p,n) - CID pixel, bit field,
P(p,n) - PUP pixel, bit field, A(p(b),n) - OLAY pixel, (buffer), bit field, a(n) - Alpha
Programming of the Planes(2:0), Drawdepth(1:0), and RGBmode bits will allow writing the frame buffer
formats shown in Table 23.


[Table body not recoverable from the source. Columns: pixel type, Planes(2:0), Drawdepth(1:0), RGBmode.
Rows listed: RGB-SB, RGB-DB, CI-SB, CI-DB (24); RGB-SB, RGB-DB, CI-SB, CI-DB 4+4 (8/24);
RGBa-DB 3324 + 3324 (24); RGBa-SB (24); OLY-SB 8 bit (24). Legible fragments: RGB-DB: 00, 1;
OLY-SB: 100; 101, xx, 0.]

Table 23: Frame buffer formats programmed by Planes(2:0), Drawdepth(1:0) and RGBmode

Note: For all the modes shown as double buffered, software will have to set the writemask for the
appropriate buffer. REX3 allows writing to any one plane at a time. When writing to one of the auxiliary
planes (CID, PUP or OVERLAY), the writemask has to be set to disable writing to the other planes, e.g. the
PUP and OVERLAY planes have to be masked when writing to the CID planes. The writemask would have to
match the pixel formats shown in Table 22.


3.10 Framebuffer PIO and DMA


REX3 supports programmed I/O and DMA reads and writes, from and to all bitplane types. All spanmode
read and write addressing must step X from left to right. Packing, unpacking, and word size are specified
via the following DRAWMODE1 bits: RWPACKED, HOSTDEPTH<1:0>, RWDOUBLE. All reads and writes
are made through the HOSTRW1,0 register pair; for 32b access, only HOSTRW0 is used. All writes to
framebuffer rely on COLORHOST and/or ALPHAHOST=1 to indicate that HOSTRW values, not DDA, are used.


The data formats for the HOSTRW1,0 register are illustrated in the accompanying table. Each data value
resides in a field of 8, 16, or 32 bits as programmed by HOSTDEPTH; the leftmost field is the first one to
be used; each value is right-aligned within the field, and zero-filled where appropriate. RGB data have
components ordered such that red is the least significant, and blue (or alpha) the most significant, subfield.
Reads of framebuffer values via HOSTRW registers return undefined values for start-byte masked locations
and for unused, trailing fields.


Pixel programmed I/O refers to host reads and writes of either pixels, overlays, popups, or CID planes.
REX3 is set up by host to the desired mode via DRAWMODE0 and DRAWMODE1. The bits STOPONX,
STOPONY should be zero, indicating one GIO word per primitive "GO". For pixel reads, the DRAWMODE1
PREFETCH bit must be set to 1, with the DRAWMODE0 OPCODE=read. The set up of the DRAWMODE
registers should be performed with a write to the GO (address+800H) command, prefetching the data,
reducing the I/O latency of subsequent transfers. Pixel data may then be read from the HOSTRW register,
again with the "GO" command. All PIO is context switchable. Reads and writes of the HOSTRW register
when saving or restoring context should be made without the "GO" command (address 800H bit) being set.


To the REX3, DMAs are indistinguishable from burst activity, in that the GIO activity is identical. Any
distinction between these two modes is purely at the level of the kernel and, to some lesser extent, the write
buffer (or MC). We will hereafter refer to any burst access as DMA; the term "word" refers to the width of data
transferred in a bus cycle (for REX3, 4 or 8 bytes, depending on the state of the CONFIG register BUSWIDTH
bit: see Section 4.1). To ensure correct operation with the MC, all pixel DMA read transfers must be performed
with DRAWMODE1 PREFETCH = 0. In addition, a pixel DMA read transfer may not begin until the graphics
pipe is idle (STATUS GFXBUSY = 0). Pixel DMA transfers must be made with the "GO" command bit set.


DMA is supported for span and block addressing modes. In either case, each burst is restricted to left-to- 
right stepping per scanline; there is no support for right-to-left or mirroring. Span DMA is supported for ar- 
bitrary byte count and start byte values. Block DMA may have certain restrictions, as noted below. 


There are two main categories of block DMA: linear and stride. A linear block DMA sends data across the
bus in a single string, so that a block of data in framebuffer is packed into consecutive locations within this
string and written or read as such to/from main memory. The DMA stride register in the write buffer is set to
zero for this mode (linewidth = total transfer, linecount = one). There are REX3 restrictions on this type of
block DMA: the start byte (SB) must be zero, and the width per scanline must be an integer number of bus
words (64b words for GIO64 64b transfers, for instance). This width constraint applies to main memory
storage, and not the actual width in framebuffer: the REX3 block addressing coordinates are programmed to
exactly the desired size, and unused bytes contained in the last word per block row are ignored on writes.
In effect, the end pixel of a block row must never be packed into the same GIO word as the start pixel of the
next row: this is a REX3 restriction; to overcome this limitation, stride block DMA is used.
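

The two linear block DMA restrictions can be expressed as a simple host-side check. This is a sketch only; the function names and the idea of padding each row up to a whole number of bus words in main memory are illustrative assumptions drawn from the description above, not a documented driver interface.

```python
def linear_block_dma_ok(start_address, row_bytes, bus_bytes=8):
    """True if SB is zero and each row occupies a whole number of bus words."""
    return start_address % bus_bytes == 0 and row_bytes % bus_bytes == 0


def padded_row_bytes(row_bytes, bus_bytes=8):
    """Main-memory bytes per row after rounding up to a whole bus word."""
    return -(-row_bytes // bus_bytes) * bus_bytes
```

For a 13-byte row on a 64b bus, `padded_row_bytes(13)` gives 16, so the trailing 3 bytes of the last word in each row are the unused bytes that REX3 ignores on writes.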


Stride block DMA consists of bursts of contiguous data which are each separated by a constant number of
bytes, essentially an address gap. It supports a "virtual framebuffer" in main memory with which any block
subset can be read or written, therefore not necessarily as a single contiguous, linearly addressed string.
By definition, the linecount register in the write buffer is set to the number of framebuffer rows, and the
linewidth is set to the width of one row. The DMA stride register is usually set to a nonzero value, equal to
the (byte) distance between bursts in a physically mapped main memory. (However, this value can be zero to
handle the case of packed, linear data where the last pixel of a block row is packed in the same main memory
word as the first pixel of the next row: this memory word is then accessed at least twice, once per scanline.)

A single kernel call initiates the stride block DMA, which is decomposed by the write buffer into a GIO burst
per scanline. Each of these scanline DMAs has an identical BC value, but SB may vary per scanline and is
calculated incrementally by the write buffer. (For example, say the first SB is SB(0), calculated as
Start_Address%8; then each subsequent SB is calculated as SB(i) = {SB(i-1) + BC + Stride}%8, for 64b DMA.)
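
The incremental start-byte calculation in the example above can be written out as a short sketch. The function name `start_bytes` and its parameters are hypothetical; `bus_bytes` is 8 for 64b DMA as in the text and would be 4 for the 32b case.

```python
def start_bytes(start_address, byte_count, stride, rows, bus_bytes=8):
    """Return the start byte SB(i) for each of `rows` scanline bursts.

    SB(0) = start_address % bus_bytes
    SB(i) = (SB(i-1) + byte_count + stride) % bus_bytes
    """
    if rows < 1:
        raise ValueError("rows must be at least 1")
    sb = start_address % bus_bytes
    result = [sb]
    for _ in range(rows - 1):
        sb = (sb + byte_count + stride) % bus_bytes
        result.append(sb)
    return result
```

For instance, a transfer starting at address 0x1003 with BC = 10 and Stride = 5 advances 15 bytes per row, so SB steps through 3, 2, 1, 0 over four rows.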


All DMA's can be pre-empted and resumed, with the restriction that during the preemption period of a read 
DMA, no other access is made to the subsystem of REX3 pertaining to the pre-empted DMA. In short, REX3 
contains two main subsystems: one for graphics, the other for the display control bus. DMA with the graph- 
ics subsection can be pre-empted by host in order to perform accesses across the display control bus, but 
not to the graphics. The converse is also true. In any event, no other DMA may be performed with REX3 
during a read DMA preemption. In addition, rev. 0 and rev. 1 REX3 chips do not allow the reading of pixels 
(from HOSTRW) when a read DMA from DCBDATA is preempted. Similarly, rev. 0 and rev. 1 REX3 chips 
do not allow display control bus data (from DCBDATA) to be read while a read DMA from HOSTRW is pre- 
empted. Access to the STATUS, CONFIG, and DCBRESET registers are a third category, or subsection, 
for which this rule applies. Therefore, STATUS may be read by host during any DMA preemption. 


Notes: Unlike REX1, the REX3 supports nonuniform SB for block DMA: therefore the logic will subtract the 
number of pixels represented by SB from XSTART at start of each burst. (Programmers note: you will not 
have to subtract SB from XSTART, which you did for REX1.) 


Host Pixel Packing


RWDOUBLE | RWPACKED | RWDEPTH(1:0) | GIO_DATA(63:0)

Table 0.2.2.2. HOSTRW pixel packing modes (big endian format illustrated).


The following table summarizes the above cases:


Table 24: Summary of PIO and DMA Cases

Case                   | Context    | Bursts  | PREFETCH | SB  | BC                  | STOPONX,Y
                       | Switchable |         |          |     |                     |
-----------------------+------------+---------+----------+-----+---------------------+----------
PIO                    | yes        | ?       | ?        | 0/4 | 8/4                 | 0,0
Linear DMA Span Write  | yes        | single  | --       | 0-7 | length              | 0,0
Linear DMA Span Read   | no         | single  | no       | 0-7 | length              | 1,0
Linear DMA Block Write | yes        | single  | --       | 0   | length (length%8=0) | 0,0
Linear DMA Block Read  | no         | single  | no       | 0   | length (length%8=0) | 1,1
Stride DMA Block Write | yes        | per row | --       | 0-7 | width per row       | 0,0
Stride DMA Block Read  | no         | per row | no       | 0-7 | width per row       | 1,0

(? = entry illegible in the source.)

(note 1: STOPONX,STOPONY are from DRAWMODE0 register; PREFETCH is from DRAWMODE1.)
(note 2: above is for 64b GIO transfers; for 32b case, SB 0-7 is then 0-3; length%8 is then length%4.)
(note 3: when a span crosses a page boundary, an additional burst is done; BC is decomposed per burst.)


3.11 FIFO Management


A bus timeout counter (CONFIG register TIMEOUT field) is provided (1-4.3 usec) to generate a
FIFO_INT_N interrupt during continuous GRXDLY stalling for host I/O; the graphics FIFO is enlarged to 32
deep, doubles. (Note: GRXDLY is asserted whenever the FIFO level meets or exceeds CONFIG GFIFODEPTH
and CONFIG GFIFOABOVEINT is set; at timeout, this FIFOMAX GRXDLY stall is disabled, while
FIFO_INT_N remains asserted until the FIFO drains below the GFIFODEPTH level or GFIFOABOVEINT is
cleared, enabling a “below level” interrupt.) DMA uses this mechanism also, though it should never result in
a timeout because the FIFO drain rate is sufficient to reset the timer-counter frequently.


In the event that the FIFOMAX level is set too high so that FIFO overflow would occur, the following would 
happen: first, the FIFO would not push overflow data; second, this data would then be lost; third, this is 
ensured by the h/w disallowing push when (fifolevel=32 & !pop). Occurrence of this condition, for system 
prototyping and debug, will be visible via assertion of GRXDLY during any “overflow push” clocks. Logic an- 
alyzer trigger on (FIFO_INT_N & GRXDLY) will capture it; should never trigger in a correctly configured/ 
operating system. 
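The overflow rule above can be modeled in a few lines. This is a minimal behavioral sketch in Python, not the hardware implementation: the class name `GfifoModel`, its `clock` method, and the `dropped` counter are illustrative assumptions; only the depth of 32 and the push-inhibit condition (fifolevel=32 & !pop) come from the text.

```python
from collections import deque

FIFO_DEPTH = 32  # graphics FIFO depth stated in the text


class GfifoModel:
    """Hypothetical model of the GFIFO push rule (names are assumptions)."""

    def __init__(self):
        self.entries = deque()
        self.dropped = 0  # writes lost because the push was disallowed

    def clock(self, push_data=None, pop=False):
        """Advance one clock; return the popped entry, or None."""
        full_before = len(self.entries) == FIFO_DEPTH
        popped = self.entries.popleft() if (pop and self.entries) else None
        if push_data is not None:
            if full_before and not pop:
                # push disallowed when (fifolevel=32 & !pop): data is lost
                self.dropped += 1
            else:
                self.entries.append(push_data)
        return popped
```

A push that coincides with a pop on a full FIFO succeeds in this model, which matches the stated condition; whether the hardware times the two events within the same clock in exactly this way is an assumption.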


The host should check STATUS or USER_STATUS for graphics idle (GFXBUSY=0) and GFIFO empty
(GFIFOLEVEL = 0x00) before beginning a read or a series of reads, to avoid bus timeout. User programs
should check USER_STATUS so that interrupt information contained in the STATUS register is not cleared.
Additionally, there are several cases which yield worst-case latency for framebuffer reads which the system
should accommodate without generating a bus timeout: (a) any PIO read may be delayed by VRAM transfer
and refresh cycles; this latency increases for slower video rates and interlaced display; (b) the maximum number of
memory cycles required for a read is for a double-word of packed 8b or 4b data, unaligned in VRAM; (c) the
transition between block rows is stalled by the graphics pipeline so that the current scanline is finished through
the read packer before the next scanline read is initiated (this is to accommodate the Y swizzle in memory).
The combined effect of all these cases will yield a worst-case latency for reads which the system must
accept. (**we should have numbers for this before tapeout!**).
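The idle check described above amounts to a bounded polling loop. The sketch below is an illustration only: `wait_for_graphics_idle` and the `read_user_status_fields` callback are hypothetical names, and the callback is assumed to return the already-decoded GFXBUSY and GFIFOLEVEL fields of USER_STATUS, since the bit positions are not given in this section.

```python
def wait_for_graphics_idle(read_user_status_fields, max_polls=1000):
    """Poll until GFXBUSY=0 and GFIFOLEVEL=0, or give up after max_polls.

    read_user_status_fields is a hypothetical callable returning the tuple
    (gfxbusy, gfifolevel) decoded from USER_STATUS, so that the interrupt
    latches held in STATUS are not cleared by the poll.
    """
    for _ in range(max_polls):
        gfxbusy, gfifolevel = read_user_status_fields()
        if gfxbusy == 0 and gfifolevel == 0:
            return True  # safe to begin the read or series of reads
    return False  # caller decides how to handle a pipeline that never idles
```

Bounding the loop is a design choice of this sketch, so that a stuck pipeline surfaces as a return value instead of a hang.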


3.12 Context Switching 


REX3 supports context switching except during a preempted read DMA. There are two main contexts
which may be switched: graphics context, and display bus context. Graphics context includes the X, Y val- 
ues, colors, and all other modifiers or modes which affect writing into, or reading from, the framebuffer. Dis- 
play bus controller context, by definition, includes those registers specifying target device, interface protocol 
and timing, and data registers used for display bus transactions. Before any context switching can take 
place, the host must poll the STATUS register and wait for the appropriate BUSY bit to be cleared (GFX- 
BUSY for graphics, BACKBUSY for display bus backend). At such time, the context registers are consid- 
ered to be stable. 


A complete graphics context save is performed by first checking for GFXBUSY=0 and GFIFOLEVEL = 0x00, 
then reading all registers except display control bus registers. Only those registers which have a read for- 
mat listed need be saved. In many cases, however, it is likely that only a subset of these registers need be 
saved; it is up to the application to decide this. The read of the HOSTRW or any other register must not be
issued with a “GO” (address+800H) command. 


A complete graphics context restore is done by writing back all the registers which were saved. There is
one complication the host must handle during context save/restore of the SLOPERED register: the saved
value must be converted from a 2's complement (s12.11) to a signed magnitude (s(8)12.11) format before
writing back to REX3. Restoral of COLORRED must be done with DRAWMODE bit RGBMODE=1 in order
to circumvent the 12b CI formatting process. Also note that XSAVE should be restored after XSTART. The
write to the HOSTRW or any other register must not be issued with a “GO” (address+800H) command. The
restored process may be started immediately.
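The SLOPERED conversion can be sketched as follows. This is an assumption-laden illustration, not the documented register layout: the function name is hypothetical, `width=24` assumes that s12.11 means 1 sign + 12 integer + 11 fraction bits, and `sign_bit=31` is an assumed position for the sign in the written word; consult the SLOPERED register description for the actual write format.

```python
def twos_to_sign_magnitude(raw, width=24, sign_bit=31):
    """Convert a two's complement field to sign-magnitude form.

    width and sign_bit are assumptions (see the lead-in); adjust them to
    the real SLOPERED read and write formats.
    """
    mask = (1 << width) - 1
    raw &= mask
    negative = (raw >> (width - 1)) & 1
    # negate modulo 2**width to obtain the magnitude of a negative value
    magnitude = ((-raw) & mask) if negative else raw
    if magnitude >> (width - 1):
        # the most negative value has no sign-magnitude counterpart
        raise ValueError("value not representable in sign-magnitude")
    return (negative << sign_bit) | magnitude
```

For example, under these assumptions a saved value of 0xFFF800 (-1.0 in s12.11) becomes magnitude 0x000800 with the sign bit set.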


Display bus context switching is done simply by reading or writing the appropriate registers, after waiting for
BACKBUSY=0. If non-atomic transfers are performed to or from the devices on the display bus, their context
will also need to be saved/restored; however, this should no longer be necessary, now that REX3 supports
packing and unpacking of multiple-byte data onto the bus.


3.13 Display Control Bus


The host communicates with devices on the Display Control Bus (DCB) by first writing to the DCBMODE 
register, and then writing to or reading from the DCBDATA register. Data written to both the DCBMODE and
DCBDATA registers pass through the BFIFO (backend fifo) prior to their being used by the DCB state 
machine or transferred on the DCB. The DCB state machine will empty the BFIFO prior to starting a read 
operation on the DCB. 

Slave device selection is made by the DCBADDR(3 downto 0) field of the DCBMODE register. No physical 
device attached to the DCB will ever be allowed to respond to the reserved DCBADDR = X"F". DCBADDR 
decoding for the Newport Graphics subsystem is as follows: 














DCBADDR        Device
0000           VC2
0001           CMAP0 and CMAP1 (write only)
0010           CMAP0
0011           CMAP1
0100           XMAP0 and XMAP1 (write only)
0101           XMAP0
0110           XMAP1
0111           RAMDAC
1000           Video CC1
1001           Video AB1
1010 to 1110   undefined
1111           reserved











Table 25: Newport Graphics DCBADDR Decoding 
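For driver or simulation code, Table 25 maps naturally onto a lookup table. The sketch below is illustrative; the names `DCBADDR_DEVICES` and `dcbaddr_device` are assumptions, while the address-to-device mapping is taken from the table.

```python
# DCBADDR decoding from Table 25; addresses 1010 to 1110 are undefined.
DCBADDR_DEVICES = {
    0b0000: "VC2",
    0b0001: "CMAP0 and CMAP1 (write only)",
    0b0010: "CMAP0",
    0b0011: "CMAP1",
    0b0100: "XMAP0 and XMAP1 (write only)",
    0b0101: "XMAP0",
    0b0110: "XMAP1",
    0b0111: "RAMDAC",
    0b1000: "Video CC1",
    0b1001: "Video AB1",
    0b1111: "reserved",
}


def dcbaddr_device(addr):
    """Return the device name selected by a 4-bit DCBADDR value."""
    if not 0 <= addr <= 0xF:
        raise ValueError("DCBADDR is a 4-bit field")
    return DCBADDR_DEVICES.get(addr, "undefined")
```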


The register to be accessed within the device selected by DCBADDR is determined by the DCBMODE 
DCBCRS(2 downto 0) field. If the DCBMODE ENCRSINC bit is set, then DCBCRS will increment following 
the transfer of each byte on the DCB. 


The protocol used to transfer data on the DCB is described by the ENASYNCACK and ENSYNCACK fields 
in DCBMODE. 


If ENASYNCACK is set, an asynchronous handshake protocol will be used to transfer data across the
DCB. The asynchronous handshake protocol runs as follows: The REX3 will assert DCB_CS_N when it is
presenting valid data (for write cycles) or ready to accept data (for read cycles). The slave device asserts
DCB_ACK_N when it has accepted data (write cycles) or is returning the requested data (read cycles).
When the REX3 detects that DCB_ACK_N, synchronized to the 33 MHz GIOCLK, has been asserted, it
will de-assert DCB_CS_N. The slave device will then de-assert DCB_ACK_N. The REX3 will not
begin another transfer on the bus until a synchronized de-asserted DCB_ACK_N has been detected. Data
can be transferred at a peak rate of 1 byte / 4 cycles.


If ENSYNCACK is set, a synchronous handshake protocol will be used to transfer data across the DCB.
When the REX3 is presenting valid data (write cycles), or is ready to accept data (read cycles), it will assert
DCB_CS_N. When the slave device is accepting the data, or is returning the requested data, it will assert
DCB_ACK_N in the current GIOCLK cycle. This protocol will allow the transfer of data at a peak rate of
1 byte/cycle.


If neither ENASYNCACK nor ENSYNCACK is set, then data will be transferred at a rate determined solely
by the DCBMODE cycle timing parameters CSSETUP(4 downto 0), CSWIDTH(4 downto 0), and
CSHOLD(4 downto 0). DCB_RW_N, DCB_ADDR, DCB_CRS, and (for write cycles) DCB_DATA will be
valid for CSSETUP cycles prior to asserting DCB_CS_N. DCB_CS_N will then be asserted for
(CSWIDTH + 1) cycles. DCB_CS_N will then be de-asserted for CSHOLD cycles prior to changing any of
the DCB control signals. For read transfers, data will be sampled by the REX3 at the end of the last cycle in
which DCB_CS_N is asserted.
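The three protocols imply different per-byte costs. The following sketch computes them under stated assumptions: the function names are hypothetical, the handshake figures are the peak rates quoted above (a slow slave will take longer), the behavior when both ACK bits are set is not described here and is rejected, and idle cycles between transactions are not counted.

```python
GIOCLK_HZ = 33_000_000  # nominal DCB clock rate from the text


def dcb_cycles_per_byte(enasyncack, ensyncack, cssetup=0, cswidth=0, cshold=0):
    """Best-case GIOCLK cycles per byte for the selected DCB protocol."""
    if enasyncack and ensyncack:
        raise ValueError("behavior with both ACK modes set is not described")
    if enasyncack:
        return 4  # asynchronous handshake, peak 1 byte / 4 cycles
    if ensyncack:
        return 1  # synchronous handshake, peak 1 byte / cycle
    for name, value in (("CSSETUP", cssetup), ("CSWIDTH", cswidth),
                        ("CSHOLD", cshold)):
        if not 0 <= value <= 31:
            raise ValueError(name + " is a 5-bit field")
    # setup cycles, then (CSWIDTH + 1) asserted cycles, then hold cycles
    return cssetup + (cswidth + 1) + cshold


def dcb_peak_bytes_per_second(cycles_per_byte, clock_hz=GIOCLK_HZ):
    """Convert a per-byte cycle count into a peak byte rate."""
    return clock_hz / cycles_per_byte
```

For instance, CSSETUP=2, CSWIDTH=3, CSHOLD=1 gives 2 + 4 + 1 = 7 cycles per byte in timed mode.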


The DATAWIDTH(1 downto 0) field describes the number of bytes to transfer from each word written to or
read from DCBDATA0 or DCBDATA1 when ENDATAPACK is cleared. It is used to simplify the transfer
across the GIO64 bus of 3-byte (RGB triplet) quantities packed into words. When ENDATAPACK is set, all
bytes written to or read from DCBDATA will be transferred across the DCB.


The SWAPENDIAN bit, in conjunction with the DATAWIDTH field, is used to support the OpenGL 
SWAP ENDIAN pixel packing attribute. When set, the ordering of bytes within short and long width data is 
reversed. 
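The effect of SWAPENDIAN on short and long data can be illustrated as a byte reversal. This is a sketch only: the function name and the `width_bytes` parameter are assumptions (the DATAWIDTH encoding is not given in this section), and 3-byte data is rejected because the text only describes short and long widths.

```python
def swap_endian(value, width_bytes):
    """Reverse the byte order of a short (2-byte) or long (4-byte) value."""
    if width_bytes == 1:
        return value & 0xFF  # a single byte has no ordering to reverse
    if width_bytes not in (2, 4):
        raise ValueError("only short and long widths are described")
    raw = value.to_bytes(width_bytes, "big")
    return int.from_bytes(raw, "little")
```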


Once the DCBMODE register has been written to, subsequent reads from and writes to the DCBDATA
register will result in data transfers on the DCB, using the specified timing and protocol.


3.14 Chip Reset and Initialization 


Following reset, the REX3 assumes that it is attached to GIO64 bus that is physically 32 bits wide, and that 
the registered transceivers that isolate the pipelined GIO64 bus from the non-pipelined GIO64 bus are phys- 
ically present. If the registered transceivers are not present (as in the Sapphire system), the host must clear 
the EXTREGXCVR bit in the CONFIG register prior to performing any reads from REX3 registers. If the
REX3 is attached to a GIO64 bus which is physically 64 bits wide, the BUSWIDTH bit in the CONFIG register
should also be set at this time. If the REX3 is installed in a system with a GIO32 bus master, the
GIO32MODE bit in the CONFIG register must be set.


4 System Interface


4.1 GIO64 Bus Interface 


The REX3 is a pipelined GIO64 slave device. The REX3 does not check or generate parity, so the GIO64
bus parity signals, P_ADP(7 downto 0) and P_VLD_PARITY_N, are ignored. Only the two least significant
SLOT NUMBER pins from the GIO64 bus are brought into the REX3 for address comparison. The two
most significant SLOT NUMBER pins are assumed to be B"11". This implies that the base address of the
REX3 and associated Newport Graphics subsystem is at X"1F000000", X"1F400000", X"1F800000", or
X"1FC00000". Consequently, REX3/Newport Graphics subsystems may only be placed in GIO64 slots C,
D, E or F. Multiple head operation (up to four displays) is achieved by populating slots C, D, E, and F with
REX3/Newport Graphics subsystems.
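The slot-to-address relationship can be written out as a small helper. This is a sketch under an assumption: the four base addresses are spaced X"400000" apart, so the two low slot bits are taken to select address bits 23:22, with slot C at the lowest address; the function name is hypothetical.

```python
REX3_BASE = 0x1F000000  # lowest of the four base addresses listed above


def rex3_base_address(slot):
    """Return the assumed base address for GIO64 slot 0xC, 0xD, 0xE or 0xF."""
    if slot not in (0xC, 0xD, 0xE, 0xF):
        # the two most significant slot bits are assumed to be B"11"
        raise ValueError("REX3 may only be placed in slots C, D, E or F")
    return REX3_BASE | ((slot & 0x3) << 22)
```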

Two interrupts are returned from the Newport Graphics subsystem by REX3. VV_INT_N is the sum of 
the VERT_INT_N (vertical retrace) signal (from the VC2), and the VIDEO_INT_N signal (from the Express 
Video Option). The REX3 will latch the occurrence of a falling edge on the VERT_INT_N input and assert
the VV_INT_N interrupt. A ‘low’ level on the VIDEO_INT_N input will also result in VV_INT_N being 
asserted. The host determines the source of VV_INT_N by reading the STATUS register. When the 
STATUS register is read, VRINT, the latch associated with VERT_INT_N, is cleared, removing that contri- 
bution to VV_INT_N. User code which is not willing to service VERT_INT_N interrupts should read the 
USER_STATUS register, which does not clear the VRINT latch. 
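The VV_INT_N behavior described above (an edge-latched vertical retrace source combined with a level-sensitive video source) can be modeled as follows. The class and method names are illustrative assumptions; only the latch, the level sensitivity, and the clear-on-STATUS-read behavior come from the text.

```python
class VvIntModel:
    """Hypothetical behavioral model of the VV_INT_N interrupt logic."""

    def __init__(self):
        self.vrint = False       # latch set by a falling edge on VERT_INT_N
        self._prev_vert_n = 1
        self._video_int_n = 1

    def sample(self, vert_int_n, video_int_n):
        """Sample both active-low inputs once per clock."""
        if self._prev_vert_n == 1 and vert_int_n == 0:
            self.vrint = True    # falling edge latched
        self._prev_vert_n = vert_int_n
        self._video_int_n = video_int_n

    @property
    def vv_int_n(self):
        """Active-low output: asserted (0) if either source is active."""
        return 0 if (self.vrint or self._video_int_n == 0) else 1

    def read_status(self):
        """Reading STATUS returns VRINT and clears the latch."""
        value, self.vrint = self.vrint, False
        return value

    def read_user_status(self):
        """Reading USER_STATUS returns VRINT without clearing it."""
        return self.vrint
```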

FIFO_INT_N is generated whenever the number of entries in either the graphics fifo (GFIFO) or the display
control bus fifo (BFIFO) has exceeded a programmed level for a programmed amount of time, or when
the number of entries in either the GFIFO or BFIFO has fallen below a programmed level. REX3 fifo interrupt
behavior is therefore determined by the CONFIG register BFIFODEPTH, GFIFODEPTH,
BFIFOABOVEINT, GFIFOABOVEINT, and TIMEOUT fields. Whenever a ‘fifo above’ interrupt is generated, this
occurrence is latched in either the STATUS register BFIFO_INT or GFIFO_INT field. Reading the STATUS
register will reset these bits, but FIFO_INT_N will remain asserted as long as the interrupting condition
exists. The latching ‘fifo above’ status is intended to provide the host with a means of identifying the source
of spurious interrupts. User code should only read status from USER_STATUS, to prevent the uncontrolled
clearing of the BFIFOABOVEINT, GFIFOABOVEINT, and VRINT interrupt status bits.

The REX3 will operate as a pipelined GIO64 slave with or without the presence of external registered
transceivers. By default, the REX3 assumes that the external registered transceivers that define the pipelined GIO64
bus are present. The absence of the external registered transceivers is communicated to the REX3 by
programming the CONFIG register EXTREGXCVR bit to B"0". When REX3 is installed in a system without external
registered transceivers, this bit must be programmed to B"0" prior to any read operation.

The REX3 will respond to both 64-bit and 32-bit wide GIO64 bus masters, as determined by the
P_GSIZE64 signal. The REX3 will operate with a GIO64 bus that is physically either 64 bits wide or 32 bits
wide, as determined by the CONFIG register BUSWIDTH bit.

The REX3 will follow the GIO32 bus protocol when the CONFIG register GIO32MODE bit is set. In this 
mode, data transferred during the GIO bus byte count cycle will be interpreted according to GIO32 protocol 
convention. 

The REX3 supports both little-endian and big-endian addressing conventions in GIO64 mode. 

Please consult the GIO64 Bus Specification, and the Graphics IO (GIO) Bus Specification for precise 
descriptions of the GIO64 and GIO32 protocols. 


4.2 Display Control Bus Interface 


The Display Control Bus (DCB) is an 8-bit, 33 MHz bus controlled by the REX3, interfacing the REX3 to
the XMAPs, VC2, CMAPs, RAMDAC, and Video Option in the Newport Graphics subsystem. In addition
to the 8 bidirectional data lines (DCB_DATA(7 downto 0)), the bus includes 4 device address lines
(DCB_ADDR(3 downto 0)), driven by the REX3, which are externally decoded to produce 15 device chip
select signals. The bus also includes 3 command/register select lines (DCB_CRS(2 downto 0)), allowing
eight registers to be accessed within each device. A data transfer direction line (DCB_RW_N), a command
strobe line (DCB_CS_N), and an acknowledge signal (DCB_ACK_N) complete the set of bus signals.

All signals on the DCB sourced by the REX3 (DCB_DATA, DCB_ADDR, DCB_CRS, DCB_RW_N, and
DCB_CS_N) change on the rising edge of the 33 MHz GIO_CLK. All inputs (DCB_DATA and DCB_ACK_N)
are sampled with the rising edge of GIO_CLK.

DCB_ADDR(3 downto 0) = X"F" is reserved as a null-device chip select. No physical device is allowed
to respond to transactions to this reserved address.

The DCB supports different slave device timing requirements, synchronous and asynchronous opera- 
tion, and data transfer protocols with or without acknowledgement. These different modes of operation are 
programmed through fields in the DCBMODE register. 

Driver conflict and bus contention are avoided by having the REX3 insert at least two idle cycles
(DCB_DATA tri-stated, DCB_CS_N = B"1", DCB_ADDR = X"F") between transactions of different directions
(read followed by write, or write followed by read), and between transactions to or from different slave
devices. As the DCB_ACK_N signal may be shared by multiple devices, a slave device must return this
signal to the inactive ("1") state prior to tri-stating its driver.

Please consult the Display Control Bus Specification for precise descriptions and definitions of the
Display Control Bus protocol.


4.3 VRAM Interface


The memory controller runs at 66 MHz and is made of five state machines. The state
machines and their respective functions are as follows:


1. CONTROL_MODULE   Controls the other state machines

2. TR_FSM           Performs screen refresh and memory refresh

3. LD_REG_FSM       Loads write mask and color regs. in RB2 and Vrams respectively

4. WRITE_FSM        Performs write only operations to Vram (including block writes)

5. RMW_FSM          Performs read, read modify write and read/read modify write cycles


The WRITE_FSM and RMW_FSM state machines keep the Vrams in page mode unless a page miss occurs or a
transfer to another state machine is requested. LD_REG_FSM is invoked when the host reads that
the chip is not busy and wants to load a new write mask or new color register value in the Vrams.


When a screen refresh request is made, the state machines make sure there are no pixels in the data pipe 
before honoring the request. The memory controllers in all four banks operate independently from each other.


A page mode cycle takes 4 (66 MHz) clocks. A full RAS cycle is 11 clocks.
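As a quick worked example, the clock counts above convert to time as follows. The helper name is hypothetical, and the sketch assumes a nominal clock of exactly 66 MHz, which gives roughly 60.6 ns for a page mode cycle and 166.7 ns for a full RAS cycle.

```python
VRAM_CLOCK_HZ = 66_000_000  # nominal memory controller clock (assumed exact)

PAGE_MODE_CLOCKS = 4   # page mode cycle, from the text
FULL_RAS_CLOCKS = 11   # full RAS cycle, from the text


def clocks_to_ns(clocks, clock_hz=VRAM_CLOCK_HZ):
    """Convert a count of memory controller clocks to nanoseconds."""
    return clocks * 1e9 / clock_hz


page_mode_ns = clocks_to_ns(PAGE_MODE_CLOCKS)
full_ras_ns = clocks_to_ns(FULL_RAS_CLOCKS)
```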


Various timing for the frame buffer is shown in the next few pages.


4.4 Tester Interface


All bidirectional pins and all tri-stateable output pins are placed in their high impedance state when the TEI
pin is driven low.


The JTAG_TMS pin, when driven low, enables the scan chain mux input into each storage element. When
TP(1:0) = “11”, the JTAG_TCK pin is selected as the clock input to all REX3 flip-flops. JTAG_TDI is the scan
input into the first storage element of the scan chain. Whenever TP(0) = “1”, the output of the last (4756th)
element in the scan chain is muxed onto the JTAG_TDO pin. The first 229 elements in the scan chain are
the flip-flops which drive REX3 bidirectional and output pins, and their output enables. These
elements are:


RO_Y_DISP_0
RO_Y_DISP_1
BANK A TRI-STATE OE_N ('0' enables outputs for the next 34 pins)
RB2_DATA_A_0_(0:7)
VRAM_WBWE_N_A
VRAM_DTOE_N_A
VRAM_DSF1_A
RB2_SEL_A_0
RB2_SEL_A_1
RB2_SEL_A_2
VRAM_ADDR_A (9 address bits)
VRAM_RAS_A
VRAM_CAS_A_0
VRAM_CAS_A_1
RB2_DATA_A_1_(0:7)
BANK B TRI-STATE OE_N ('0' enables outputs for the next 34 pins)
RB2_DATA_B_0_(0:7)
VRAM_WBWE_N_B
VRAM_DTOE_N_B
VRAM_DSF1_B
RB2_SEL_B_0
RB2_SEL_B_1
RB2_SEL_B_2
VRAM_ADDR_B (9 address bits)
VRAM_RAS_B
VRAM_CAS_B_0
VRAM_CAS_B_1
RB2_DATA_B_1_(0:7)
BANK C TRI-STATE OE_N ('0' enables outputs for the next 34 pins)
RB2_DATA_C_0_(0:7)
VRAM_WBWE_N_C
VRAM_DTOE_N_C
VRAM_DSF1_C
RB2_SEL_C_0
RB2_SEL_C_1
RB2_SEL_C_2
VRAM_ADDR_C (9 address bits)
VRAM_RAS_C
VRAM_CAS_C_0
VRAM_CAS_C_1
RB2_DATA_C_1_(0:7)
BANK D TRI-STATE OE_N ('0' enables outputs for the next 34 pins)
RB2_DATA_D_0_(0:7)
VRAM_WBWE_N_D
VRAM_DTOE_N_D
VRAM_DSF1_D
RB2_SEL_D_0
RB2_SEL_D_1
RB2_SEL_D_2
VRAM_ADDR_D (9 address bits)
VRAM_RAS_D
VRAM_CAS_D_0
VRAM_CAS_D_1
RB2_DATA_D_1_(0:7)
DCB TRI-STATE OE_N ('0' enables outputs for the next 17 pins)
DCB_DATA_0
DCB_DATA_1
DCB_DATA_2
DCB_DATA_3
DCB_DATA_4
DCB_DATA_5
DCB_DATA_6
DCB_DATA_7
DCB_CRS_0
...
DCB_ADDR_0
DCB_ADDR_1
DCB_ADDR_2
DCB_ADDR_3
VR_INT_REG (if this bit is set, VV_INT_N will be asserted)
VIDEO_INT_D_REG (if this bit is set, VV_INT_N will be asserted)
FIFO_INT_N (if this bit is clear, FIFO_INT_N will be asserted)
GIO BUS TRI-STATE OE ('1' enables outputs for the next 65 pins)
...
D_44
D_45
...
D_56
D_57
...


When TP(1:0) = “10”, the parametric nand-tree/process monitor output is muxed onto JTAG_TDO. All
bidirectional pins and all signal input pins (except for GIO64CLK, PLL_RESET_N, TP_0, and TP_1) are
connected to the parametric nand tree. VC_TX_REQ is the first pin in the nand tree, and the rest of the tree is
connected in increasing LSI pin number order.


When TP(1:0) = “00”, the output of the VCO ripple counter is muxed onto JTAG_TDO, allowing testing of
the PLL VCO. Three pins are dedicated for scan chain based testing of the internal logic: SCAN_EN enables
the scan chain mux input into each storage element; SCAN_IN feeds the first storage element in the scan
chain; SCAN_OUT brings the scan chain off chip. The on chip Phase Lock Loop requires 5 pins for
testing, including PLL_TSTMD, which places the PLL into Test Mode; PLL_TCLK, the PLL test mode clock input;
and PTREE_PLLTCKO, the PLL test mode clock output.


5 Architectural Description


5.1 GIO64 Bus Interface


The REX3 is a slave device on the pipelined GIO64 bus. The GIO64 bus interface is the functional block 
of the REX3 that responds to data transfer requests from a GIO64 bus master. Commands and data are 
sent to the graphics pipeline through the graphics fifo (GFIFO), commands and data to the graphics back- 
end devices (VC2, XMAP, CMAP, RAMDAC, Video Option) are sent to the Display Control Bus Interface 
through the backend fifo (BFIFO). A block diagram of the GIO64 bus interface follows. 


[Figure 5 blocks: GIO64 Bus; GIO CONTROL; Byte/Short/Word Swap; GFIFO CONTROL and GFIFO
(78 bits x 32 words), feeding the DDA Section; BFIFO CONTROL and BFIFO (73 bits x 16 words),
feeding the DCB Interface Section.]


FIGURE 5. GIO64 Bus Interface Block Diagram


5.2 Display Control Bus Interface


The Display Control Bus Interface section of the REX3 unpacks DCBMODE and DCBDATA information 
from the BFIFO, and uses this information to execute data transfers on the Display Control Bus (DCB). 
When DCBMODE data is unpacked from the BFIFO, the operating mode (DCB protocol and timing, DCB 
slave device address, DCB slave register address) of the DCB state machine is defined. Data written by 
the host to the DCBDATA register is unpacked from the BFIFO, and sent out on the DCB by the DCB state 
machine, using the defined DCBMODE. When the host performs a read of the DCBDATA register, a DCB 
read request is pushed onto the BFIFO by the GIO interface. When the DCB read request is unpacked from 
the BFIFO, the DCB state machine will execute the read data transfer cycles on the DCB, packing the data 
it receives into the BFIFO. The GIO interface will then transfer the requested data from the BFIFO back to 
the host. A block diagram of the Display Control Bus Interface follows: 








[Figure 6 blocks: BFIFO Control; BFIFO; DCB State Machine; Output Registers; Input Registers.
Bus signals shown: DCB_ADDR(3 downto 0), DCB_CRS(2 downto 0), DCB_CS_N, DCB_RW_N, DCB_ACK_N,
DCB_DATA(7 downto 0).]


FIGURE 6. Display Control Bus Interface Block Diagram


5.3 DDA Unit


The DDA unit assembles the drawing context, by unloading commands and data from GFIFO, and executes 
drawing primitives. The main components of the DDA unit are the current drawing context registers, the 3- 
stage shade and address generation pipelines, and the host pixel swizzle and pack logic. 


The top-level VHDL block is named DDA_TOP. Figure 7 shows the major functional blocks.


[Figure 7 blocks: GFIFO input (control, address, data); output to the GIO Read Register Mux (register
read, pixel read); CONTEXT_CTL; Context Registers (state, smasks, color, slope, pattern); RD_PACK;
END_CHECK; ADR_SETUP; PATTERN; SHADERS; ADR_ITERATE; SMASK_ALIGN; ADR_SCR_ALIGN; SMASK_CLIP;
X_ALIGN; FGBG; FSTCLR; AA MUX; WR_SWIZZLE; RD_SWIZZLE; RFIFO control.
Outputs to BANK FIFO(A:D): X,Y address, DAT_VAL(A:D), RGBA(0:3), with FIFO FULL returned.
Register control bits go to the Memory Controller; framebuffer read pixels arrive from RFIFO(0:7).]


FIGURE 7. DDA_TOP block diagram.


5.3.1 DDA_TOP Port List


Figure 8 shows the port diagram for DDA_TOP. There are six functional interfaces: (1) The GFIFO section
unloads GFIFO data and control into DDA_TOP next context; (2) The VRAM BANK FIFO interface loads
the memory subsection BANKFIFOs with pixel address, data, and control bits; (3) VRAM CTL routes control
bits from DDA_TOP non-pipelined registers to the memory controller; (4) VRAM READ FIFO consists of
DDA_TOP RFIFO data input and handshake; (5) READ DATA routes DDA_TOP packed GIO read data to
the GIO interface; (6) The STATUS signals communicate idle status between blocks. Table 26 outlines the
function of DDA_TOP ports.


[Figure 8 ports, by interface:
Global inputs: SRESET, CLK_33MHZ.
GFIFO: inputs GF_ADR(6:0), GF_DATA(63:0), GF_READ, GF_D32, GF_WDMAMODE, GF_STARTBYTE(2:0), GF_GO,
GF_EMPTY; output DDA_GF_POP.
VRAM BANK FIFOs: input WF_FULL_(A:D); outputs DDA_WF_WR_(A:D), DDA_WF_DAT_VAL_(A:D)(0:1),
DDA_WF_READ, DDA_WF_W_X_(A:D)(1:0), DDA_WF_W_Y(1:0), DDA_WF_X(10:3), DDA_WF_Y(9:0),
DDA_WF_PIX_(A:D)(0:1)(31:0).
VRAM CTL (NON-FIFO): outputs DDA_WRMSK_COLORREG(23:0), DDA_COLORREG_LD, DDA_WRMSK_LD,
DM1_PLANES(2:0), DM1_DRAWDEPTH(1:0), DM1_DBLSRC, DM1_COMPARE(2:0), DM1_RGBMODE, DM1_DITHER,
DM1_FASTCLEAR, DM1_BLEND, DM1_SFACTOR(2:0), DM1_DFACTOR(2:0), DM1_BACKBLEND, DM1_LOGICOP(3:0),
DM1_BLENDALPHA, CM_CIDMATCH(3:0), TOPSCAN(9:0), COLORBACK(31:0), ALPHAREF(7:0), DDA_AALIAS.
VRAM READ FIFO: inputs RF_PIX_(A:D)(0:1)(31:0), RF_EMPTY_(A:D)(0:1); output DDA_RF_POP_(A:D)(0:1).
READ DATA: input GIO_HOSTRD_POP; outputs DDA_RD_DAV, DDA_PIO_DAV, DDA_REG_DATA(63:0),
DDA_HOSTRW(63:0).
STATUS: input M_PIX_PIPE_EMPTY; output DDA_GFX_BUSY.]


FIGURE 8. DDA_TOP port diagram.


Signal Name                 | Type | Active | Description
SRESET                      | I    | H      | Global synchronous reset.
CLK_33MHZ                   | I    |        | 33MHz chip clock (GIO clock).
GF_ADR(6:0)                 | I    |        | GFIFO GIO address bits used for register decode: GIO ADR(9:8,6:2).
GF_DATA(63:0)               | I    |        | GFIFO write data.
GF_READ                     | I    | H      | GFIFO read/write flag.
GF_D32                      | I    | H      | GFIFO data width: 1=32-bit, 0=64-bit transfer (from GIO BC/SB).
GF_WDMAMODE                 | I    | H      | GFIFO DMA mode enable.
GF_STARTBYTE(2:0)           | I    |        | GFIFO GIO bus START BYTE field.
GF_GO                       | I    | H      | GFIFO GO bit, decoded from GIO address.
GF_EMPTY                    | I    | H      | GFIFO empty flag.
DDA_GF_POP                  | O    | H      | GFIFO read strobe.
WF_FULL_(A:D)               | I    | H      | Bank FIFO full flags.
DDA_WF_WR_(A:D)             | O    | H      | Bank FIFO write strobes.
DDA_WF_DAT_VAL_(A:D)(1:0)   | O    | H      | Bank FIFO individual pixel valid flags, two per bank.
DDA_WF_READ                 | O    | H      | Bank FIFO read flag.
DDA_WF_W_X_(A:D)(1:0)       | O    |        | Bank FIFO window relative X address lsbs, for dithering.
DDA_WF_W_Y(1:0)             | O    |        | Bank FIFO window relative Y address lsbs, for dithering.
DDA_WF_X(10:3)              | O    |        | Bank FIFO screen relative X address.
DDA_WF_Y(9:0)               | O    |        | Bank FIFO screen relative Y address.
DDA_WF_PIX_(A:D)(0:1)(31:0) | O    |        | Bank FIFO RGBA pixel data (8-8-8-8).
DDA_WRMSK_COLORREG(23:0)    | O    |        | Muxed pixel writemask and VRAM color register data.
DDA_COLORREG_LD             | O    | H      | VRAM color register write strobe.
DDA_WRMSK_LD                | O    | H      | VRAM writemask write strobe.
DM1_PLANES(1:0)             | O    |        | DRAWMODE1 VRAM planes enabled for R/W access.
DM1_DRAWDEPTH(1:0)          | O    |        | DRAWMODE1 drawn pixel depth.
DM1_DBLSRC                  | O    |        | DRAWMODE1 double-buffer mode pixel read source buffer select.
DM1_COMPARE(2:0)            | O    |        | DRAWMODE1 condition specifier for color compare function.
DM1_RGBMODE                 | O    | H      | DRAWMODE1 RGB (vs. color index) enable.
DM1_DITHER                  | O    | H      | DRAWMODE1 dither enable.
DM1_FASTCLEAR               | O    | H      | DRAWMODE1 pixel FASTCLEAR write mode enable.
DM1_BLEND                   | O    | H      | DRAWMODE1 blendfunction enable.
DM1_SFACTOR(2:0)            | O    |        | DRAWMODE1 source blending factor.
DM1_DFACTOR(2:0)            | O    |        | DRAWMODE1 destination blending factor.
DM1_BACKBLEND               | O    | H      | DRAWMODE1 COLORBACK destination blend enable.
DM1_BLENDALPHA              | O    | H      | DRAWMODE1 source alpha/1.0 blendfunction select for source alpha
DM1_LOGICOP(3:0)            | O    |        | DRAWMODE1 logical operation type.
CM_CIDMATCH(3:0)            | O    |        | CLIPMODE CID check compare code.
TOPSCAN(9:0)                | O    |        | Y address for top of screen scan line.
COLORBACK(31:0)             | O    |        | Destination blend color when DM1_BACKBLEND=1.
DDA_AALIAS                  | O    | H      | Enables anti-alias mode (from DRAWMODE0 OPCODE).
RF_PIX_(A:D)(0:1)(31:0)     | I    |        | RFIFO data output (framebuffer read).
RF_EMPTY_(A:D)(0:1)         | I    | H      | RFIFO empty flags.
DDA_RF_POP_(A:D)(0:1)       | O    | H      | RFIFO write strobes.
GIO_HOSTRD_POP              | I    | H      | GIO read acknowledge strobe.
DDA_RD_DAV                  | O    | H      | GIO register/DMA read data available flag.
DDA_PIO_DAV                 | O    | H      | GIO PIO read data available flag.
DDA_REG_DATA(63:0)          | O    |        | Context register read bus.
DDA_HOSTRW(63:0)            | O    |        | HOSTRW read bus.
M_PIX_PIPE_EMPTY            | I    | H      | VRAM controller pixel pipe idle flag.
DDA_GFX_BUSY                | O    | H      | Graphics pipeline idle: M_PIX_PIPE_MT & DDA pipeline idle.


Table 26: DDA_TOP port descriptions.


5.4 VRAM Controller Unit


The frame buffer controller is made up of four independent memory controllers (A thru D). Each
memory controller controls two sub banks (A0, A1, B0, etc.). Following is a description of how the banks
are interleaved.


5.4.1 Vram Interleave 


The frame buffer is made up of 8 Vrams for the base system and 24 Vrams for the upgraded system.
Each of the four banks has two sub banks, each of which interfaces to the data ports of 1 to 3 Vrams. The address is
shared between the two sub banks, as are RAS, WB/WE and DT/OE. Each sub bank has its own CAS signal.
Since the frame buffer supports 2 MBit Vrams (512 x 512 x 8 array), there are two scan lines in each row of
the Vram. Each sub bank holds the adjacent pixel along a scan line. In order to achieve a high writing rate
for line drawing, the frame buffer has been scrambled as shown in Table 27. The left two columns show the
Vram page number and the scan line number. The rest of the table shows which pixels are affected by which
bank. Figure 6 shows the X vs. Y interleave of the frame buffer. The scan line packing is shown in Table 32.
This interleaving format (making the frame buffer isotropic) lends itself to fast writing rates. For example, for vertical
lines, two pixels reside in the same Vram page and sub-bank; therefore, between all four banks REX3 can
write 8 pixels in one full memory write cycle time + one page mode cycle time. All four banks write the two
pixels in parallel. For lines of slope +/- 1, REX3 writes 8 adjacent pixels, one in each bank. For horizontal
lines every other bank is accessed in parallel and in page mode. For spans, all banks are accessed in parallel
and in page mode. This holds true when drawing lines and spans in the reverse direction (i.e. L-R vs. R-L,
etc.) as well.
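The vertical-line case can be turned into a rough throughput figure. This is a back-of-envelope sketch, not a specified performance number: it assumes that the "full memory write cycle" equals the 11-clock full RAS cycle and that the page mode cycle is 4 clocks (both values from section 4.3), that the clock is exactly 66 MHz, and that no refresh or transfer cycles intervene. The names are illustrative.

```python
VRAM_CLOCK_HZ = 66_000_000   # nominal memory controller clock
FULL_CYCLE_CLOCKS = 11       # assumed equal to the full RAS cycle
PAGE_MODE_CLOCKS = 4         # page mode cycle


def vertical_line_pixels_per_second(clock_hz=VRAM_CLOCK_HZ):
    """Estimate the peak vertical-line rate: 8 pixels per (11 + 4) clocks."""
    pixels_per_group = 8  # two pixels in each of the four banks
    clocks_per_group = FULL_CYCLE_CLOCKS + PAGE_MODE_CLOCKS
    return pixels_per_group * clock_hz / clocks_per_group
```

Under these assumptions the estimate comes to about 35.2 million pixels per second.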


Table 27: Frame buffer format


Page 3 | S-line 7 | D0 | D1 | A0 | A1 | B0 | B1 | C0 | C1
Page 2 | S-line 6
Page 1 | S-line 5
Page 0 | S-line 4
Page 3 | S-line 3
Page 2 | S-line 2
Page 1 | S-line 1
Page 0 | S-line 0

Interleave          0    1    2    3    4    5    6    7


5.4.1.1 Aux/Pixel plane Interleave 


In addition to the frame buffer interleave shown in Table 1, the AUX planes are also interleaved
with the pixel planes as shown in Table 2. Starting from column address 0, the first four bytes are AUX
planes followed by 8 bytes of pixel planes. There are two AUX values per byte. Due to the 8 way
interleave, the two AUX values in the first byte of row 0 are for pixel 0 and pixel 8, and so on. All
operations on the AUX planes are read modify write operations. This is because not all operations will be
MOD2 aligned in the X-axis, and there is considerable overhead in performing multiple write mask
operations for end point conditions. Only one of the AUX planes (CID, OLAY or PUP) can be written at any
time.

5.4.1.2 Writemask


Software has to be aware of the frame buffer formats and set the appropriate write mask pattern.
When writing one of the AUX planes, software does not have to worry about the pixel packing (i.e. two
pixels residing in one memory location), since the memory controller reads both pixels, modifies only the
necessary one while leaving the other unmodified, and then writes both pixels back.


5.4.2 Vram address generation 


The frame buffer is architected as a 4 way interleaved memory. Each interleave has two sub banks.
The multiplexed Vram address is shared between the two sub banks. Since the pixel and aux planes are
interleaved in the Vram, the column address for the aux planes is different from that of the pixel planes.
For any pixel, the corresponding aux planes are in the same Vram and same page. Only the column address
varies. The address calculation for pixel and aux as implemented in hardware is shown below.


5.4.2.1 Pixel planes column address
Y(2) && [((X(10:3) DIV 8) x 12) + 4 + (X(10:3) MOD 8)]


5.4.2.2 Aux planes column address
Y(2) && [((X(10:3) DIV 8) x 12) + (X(10:4) MOD 4)]


5.4.2.3 Row address
Y(9:3) && Y(1:0)
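
The three address formulas can be sketched in software as follows. This is an illustrative model only,
not the hardware implementation. It assumes that "&&" denotes bit concatenation, that "X10 - X3" denotes
the bit field X(10:3), and that the bracketed offset fits in the low 8 bits of the 9-bit Vram column
address (which holds for X(10:3) values up to 167). The function names are hypothetical.

```python
def bits(value, hi, lo):
    """Extract the bit field value(hi:lo) as an unsigned integer."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def pixel_column(x, y):
    """Pixel plane column address: Y(2) concatenated with the 12-byte group offset."""
    xq = bits(x, 10, 3)                       # pixel index within one interleave
    offset = (xq // 8) * 12 + 4 + (xq % 8)    # skip the 4 AUX bytes of each group
    return (bits(y, 2, 2) << 8) | offset      # assumption: offset occupies 8 bits


def aux_column(x, y):
    """AUX plane column address: two AUX values share each of the 4 AUX bytes."""
    xq = bits(x, 10, 3)
    offset = (xq // 8) * 12 + (bits(x, 10, 4) % 4)
    return (bits(y, 2, 2) << 8) | offset


def aux_pixel_select(x):
    """Bit X3, which selects between the two AUX values packed in one byte."""
    return bits(x, 3, 3)


def row_address(y):
    """Row address: Y(9:3) concatenated with Y(1:0)."""
    return (bits(y, 9, 3) << 2) | bits(y, 1, 0)
```

With this model, screen pixels 0 and 8 of a scan line share AUX byte 0 and differ only in
aux_pixel_select, which agrees with the AUX packing described in section 5.4.1.1.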

Table 29: NEWPORT MEMORY CTRL PIN DESCRIPTION

Signal | Description | Type | Timing
WR BANK O 

E_WR_FIFO_(A:D) | Write command for bank fifos | I | 33MHz
E_DAT_VAL_(A:D)_(0:1) | Data valid in bank fifo | I | 33MHz
E_SRC_DAT_(A:D)_(0:1)_(31:0) | Source data written into bank fifos, RGBA 8 bits each | I | 33MHz
E_READ | Pixel read or write command | I | 33MHz
E_X(10:3) | Screen X coordinates (DIV8) | I | 33MHz
E_Y(9:0) | Screen Y coordinates | I | 33MHz
E_W_X_(A:D)_(0:1) | Window relative X bits for dithering | I | 33MHz
E_W_Y_(0:1) | Window relative Y bits for dithering | I | 33MHz
RDFIFO_FULL_(A:D)_(0:1) | Read fifo full indicator | I |
M_WR_BANK_FULL_(A:D) | Write bank fifo full | O | 33MHz
OBA NA 

H ynchronous reset Тог disabling drivers and test tatic 
A_ALIAS | Anti alias bit | I | Static/33
ENDITHER | Enable dither | I | Static/33
VC_TX_REQ | Transfer request from VC | I | Async
VC_SET_TSC | Set transfer line # to TOPSCAN Reg. | I | Async
LD_COLR_REG | Load color reg. in Vram command (pulse) for block mode | I | 33MHz
LD_WR_MSK | Load write mask reg in RB2 command (pulse) | I | 33MHz
MSK_COLOR_REG(23:0) | Write mask/color reg data for LD_COLR_REG or LD_WR_MSK commands | I | Static/33
TOPSCAN(9:0) | First scanline for screen refresh (top of screen) | I | Static/33
CIDMATCH(3:0) | 4 bits. One of four CID values to match. | I | Static/33
PIXDEPTH(1:0) | 4, 8, 12, or 24 bits/pixel | I | Static/33
COMP(2:0) | Color compare function. If “111” => color comp disabled | I | Static/33
VREFRESH(2:0) | # of memory refreshes to follow a vram transfer cycle | I | Static/33
PLANES(1:0) | Planes to access | I | Static/33
SFACTOR(2:0) | Source factor for blend function | I | Static/33
DFACTOR(2:0) | Destination factor for blend operation | I | Static/33
COLORBACK(32:0) | Background color to be blended for textures | I | Static/33
ENBACKBLEND | Enable background color blend | I | Static/33
ALPHAREF(7:0) | Reference alpha for AFUNCTION test | I | Static/33
BLEND | Enable the blend function | I | Static/33
BLENDALPHA | Blend source alpha with alpha | I | Static/33
LOGIC_OP(3:0) | Logic operations to be performed on src/dst pixels | I | Static/33
DBLBUF | Double buffer mode | I | Static/33
DBLSRC | Read source buffer for double buffer mode | I | Static/33
FASTCLEAR | Vram blockfill mode (for writing color reg into frame buffer) | I | Static/33
RGBMODE | RGB mode vs. CI | I | Static/33
M_PIX_PIPE_MT | Pixel pipe empty | O | 66MHz
M_Y_DISP(1:0) | LSB's of screen refresh to XMAP9 for frame buffer descramble | O | 66MHz

FRAMEBUFFER SIGNALS

REX3_DAT_IN_(A:D)_(0:1)_(7:0) | Frame buffer read data (7:0) via RB2 | I | 66MHz
RAS_(A:D) | Vram Row Address Strobe | O | 66MHz
CAS_(A:D)_(0:1) | Vram Column Address Strobe | O | 66MHz
DTOE_N_(A:D) | Vram transfer / output enable signal | O | 66MHz
WBWE_N_(A:D) | Vram write control signal | O | 66MHz
DSF1_(A:D) | Vram special function signal | O | 66MHz
FA_(A:D)_(8:0) | Vram multiplexed address | O | 66MHz
REX3_DAT_OUT_(A:D)_(0:1)_(7:0) | Frame buffer write data (7:0) via RB2 | O | 66MHz
 | Byte select for RB2 | O | 66MHz
 | Output enable for bidirect data drivers | O | 66MHz
M_RD_DATA_(A:D)_(0:1)_(31:0) | Frame buffer read data (31:0) to read fifo's | O | 66MHz
M_WR_FIFO_(A:D)_(0:1) | Write command for M_RD_DATA | O | 66MHz
RDFIFO_FULL_(A:D)_(0:1) | Read fifo full (one per bank) | I | 66MHz


5.3.2 Memory Controller 


The memory controller for Newport is implemented as 4 independent bank controllers. The 4 controllers are identical
and operate independently of each other. Figure 1 shows the top level of the frame buffer controller. The
frame buffer controller includes all the dither, logic-op, blend, cid check, Afunction test, color compare, read format and
write format functions. There is one copy of each function in each sub bank except for the blend unit, which is shared
between two sub banks. The pin description of the controllers (A thru D) is shown in Table 1. Each bank has its own data
and address port as well as various control signals for RB2 and Vrams. The write bank fifos are incorporated into each
bank while the read fifos reside in the DDA section.


Each bank has two sub banks (0 and 1), so for bank A references to the sub banks will be made as A0 and
A1. The address port is shared between the sub banks of a bank, while they have their own 8 bit bidirectional data ports
interfacing to RB2's. The data ports serially send out 3 bytes to assemble a 24 bit pixel in RB2, which interfaces to
the frame buffer. Reads from the frame buffer are serialized by RB2 and read on the 8 bit data port of REX3. The byte
number to read/write to RB2 is selected by RB2_SEL(0:1) generated from REX3.


The memory controller of each bank operates the sub banks in parallel. Each bank has two valid bits so
that the pixel write can be negated at the last moment. There are 6 functional state machine units in each bank, and
one general purpose unit (called general mc decode) which is common to all 4 banks. The general mc unit
provides various decodes and also synchronizes the screen transfer requests from VC2. The 6 state machine units in
each bank are as follows:

a. RMW_FSM
b. WRITE_FSM
c. LD_REG_FSM
d. TR_FSM
e. OUT_BLOCK
f. CONTROL MODULE


Figure 3 shows the connection of the various state machines. The Control Module is responsible for
enabling the various state machines. With the exception of CONTROL MODULE and OUT_BLOCK, only one state
machine is active at any time.


The address pipeline is shown in Figure 2. Page comparison of the previous pixel address and the current
pixel address is done at the first pipeline stage. If a page miss is encountered, further reading of the write bank fifo is
inhibited and the previous two pixels are written before generating a precharge cycle for the Vrams.


The non-persistent write mask feature of the Vrams is used. The write mask is loaded into RB2 by asserting
RB2_LDWMASK and the appropriate byte select # on the RB2_SEL(0:1) lines. RB2 detects RAS being at a logic
high and asserts the write mask value onto the frame buffer data bus. Each bank has a common RAS signal for the sub
banks and two CAS signals, one for each sub bank. For lines, only one of the sub banks is operated on, whereas for
spans both sub banks are operated on simultaneously and their CAS signals are synchronous if Xaddress MOD2 is zero.
For write only cycles the controller performs early write cycles, and for read modify write cycles late write cycles
are performed. The memory controller operates at 66MHz. A full page cycle takes 10 clks (150 nS), so 80nS or faster
Vrams are required. The color reg in the Vrams is loaded by the memory controller. The block mode write cycle feature
of the Vrams is available for screen clear operations. In block mode, each bank can write 8 pixels in 4 clocks
(60nS).
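
The cycle counts quoted above can be turned into a rough screen clear estimate. The sketch below is
illustrative arithmetic only. It uses the 15 nS clock period implied by the figures in the text, and it
assumes (this is not stated in the source) that all four banks block-fill in parallel and that page
misses, refresh and transfer cycles can be ignored. The function names are hypothetical.

```python
MEM_CLOCK_NS = 15  # 66MHz memory clock, rounded to 15 nS as in the text


def clocks_to_ns(clocks):
    """Convert memory controller clocks to nanoseconds."""
    return clocks * MEM_CLOCK_NS


def block_fill_time_ns(pixels, banks=4, pixels_per_block=8, clocks_per_block=4):
    """Best-case block mode fill time, assuming all banks write in parallel."""
    pixels_per_step = banks * pixels_per_block
    steps = -(-pixels // pixels_per_step)  # ceiling division
    return steps * clocks_to_ns(clocks_per_block)


full_page_cycle_ns = clocks_to_ns(10)             # 150 nS full page cycle
screen_clear_ns = block_fill_time_ns(1280 * 1024)  # about 2.46 mS under these assumptions
```

The real figure will be higher, since each new Vram page costs a full page cycle and display transfers
interrupt the fill.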


Following is the pin description of each state machine unit and their respective state diagrams. 


5.3.3 VRAMS 

REX3 has been designed to work with VRAMS from the following vendors: 
Toshiba TC528257 
Mitsubishi M5M482256 
Hitachi HM538253 
Micron MT42C8255 
Fujitsu MB8128xx 
NEC uPD482234 
TI TMX55160 
Vitelic V53C851 
Samsung KM428C256 


All of the above have to be 70nS or faster. 
5.4 Scan Refresh Latency 


Scan refresh is initiated by asserting the VC_TR_REQ signal. A falling edge detected on this signal
triggers a 480nS timer in REX3. Once the timer has timed out, if there are no more pixels in the
pipe then the transfer state machine is invoked. When a falling edge on the VC_TR_REQ signal is detected, the
display line number (initialized from VREFRESH reg.) first increments and then a Vram serial read
transfer cycle is done. During the timeout period another falling edge on the VC_TR_REQ signal may be
generated (doing so will not restart the 480nS timer) to increment the display line number again, hence
generating interlaced mode. The minimum time to realize a new scan line in Vram after asserting VC_TR_REQ
is 650nS and the maximum time is 750nS. The timing constraints are shown in the diagram below.








[Timing diagram: VC_TR_REQ against the displayed scan line (line N - 1, line N, line N + 1), with
intervals A, B and C]

A min - 30nS
B min - 720nS
C min - 650nS, max - 750nS
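
The interaction between transfer request edges, the 480nS timer and the display line counter can be
modelled as below. This is a behavioural sketch under stated assumptions, not the REX3 logic: it assumes
the pixel pipe is already empty when the timer expires, ignores line counter wrap and VC_SET_TSC, and
the function name is hypothetical.

```python
TIMER_NS = 480  # transfer request timer described in the text


def simulate_transfer_requests(edge_times_ns, start_line):
    """Return a list of (transfer_time_ns, display_line) for the given request edges.

    Every edge increments the line counter. Only an edge that arrives while no
    timer is running starts the 480nS timer; a second edge inside the window
    increments the line again (interlaced mode) without restarting the timer.
    """
    transfers = []
    line = start_line
    timer_expiry = None
    for t in sorted(edge_times_ns):
        # A timer that has already run out produces its transfer first.
        if timer_expiry is not None and t >= timer_expiry:
            transfers.append((timer_expiry, line))
            timer_expiry = None
        line += 1
        if timer_expiry is None:
            timer_expiry = t + TIMER_NS
    if timer_expiry is not None:
        transfers.append((timer_expiry, line))
    return transfers
```

For example, two edges 100nS apart yield a single transfer that skips one line, whereas two edges
1000nS apart yield two transfers of consecutive lines.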

Table 30: GENERAL MC DECODES MODULE PIN DESCRIPTION

SIGNALS | DESCRIPTION | TYPE
SRESET | Synchronous reset | I
DEST_PIX_OP | Destination pixel is required (hence at least RMW for frame buffer) | O
PIX_PLANES | Read/write pixel planes | O
CID_PLANES | Read/write cid planes | O
OLAY_PLANES | Read/write overlay planes | O
PUP_PLANES | Read/write popup planes | O
NO_PLANES | No planes are selected for transaction |
TR_REQ(3:0) | Transfer request to control module. This is asserted after VC_TX_REQ rising edge. |
TR_PENDING(3:0) | Transfer request pending to control module. Asserted 500nS after TR_REQ | O
M_Y_DISP(9:0) | Current line being displayed. Updated by VC_TR_REQ | O
M_PIX_PIPE_MT | PIPE_IDLE(3:0) AND'ed to indicate all MC banks are idle | O


5.4.0.1 General decodes module 





This module decodes various global signals and provides information to the state machines as to which
planes are currently being accessed. It also contains the line counter for screen refresh and communicates to the DDA
section that all the banks are idle. The line counter increments every time the leading edge of VC_TR_REQ is
detected. VC_SET_TSC resets the line counter to the value in TOPSCAN(9:0). This module is shared between all 4
banks. VC_TR_REQ and VC_SET_TSC should have a minimum pulse width of 30nS.

Table 31: CONTROL MODULE PIN DESCRIPTION

SIGNALS | DESCRIPTION | TYPE
SRESET | Synchronous reset | I
MEM_CLK | Memory clock 66MHz | I
EN_CID_CHK | Cid check enabled | I
EN_CCOMP | Enable color compare | I
PIX_PLANES | Read/write pixel planes | I
CID_PLANES | Read/write cid planes | I
OLAY_PLANES | Read/write overlay planes | I
PUP_PLANES | Read/write popup planes | I
NO_PLANES | No planes are selected for transaction | I
DEST_PIX_OP | Dest pixel required | I
FASTCLEAR | Enable block write cycles of Vram | I
RMW_STATES(5) | State vector from RMW_FSM | I
BLEND | Blend function enabled | I
FREAD | | I
LD_COLR_REG | Load color reg command from host | I
LD_WR_MSK | Load write mask reg command from host | I
TR_PENDING | Transfer pending. Do transfer only after this signal is asserted | I
LD_DONE | Reg load due to LD_COLR_REG or LD_WR_MSK is done | I
F_FIFO_MT | Write bank fifos empty | I
RMW_FIFO_RD | Write bank fifo read command from RMW FSM (increment PP counter) | I
W_FIFO_RD | Write bank fifo read command from DMA FSM (increment PP counter) | I
DEC_W_PIX_CNT | Decrement PP counter command from DMA FSM (i.e. pixel is written) | I
DEC_R_PIX_CNT | Decrement PP counter command from RMW FSM (i.e. pixel is read/written) | I
CID_ADR | Select CID address for the address pipe | O
RRMW | Mem cycle requires read/read/modify write | O
RMW | Memory cycle requiring read modify write | O
FAST_READ | Pipelined read of FB (due to Scr2Scr or Host DMA/PIO read) | O
FASTX | Destination logic-op cycles only | O
MAIN_IDLE | Main state machine is in idle state | O
TR_FSM | Serial transfer has been requested | O
LD_REG_FSM | Enable LD_REG FSM (do color or write mask reg load) | O
RMW_FSM | Enable RMW FSM (do pipeline reads or "r/mw type" mem cycles) | O
WRITE_FSM | Enable DMA WRITE FSM (do pipelined writes to the FB) | O
SEL_COLR_REG | Mux select for color data | O
PIPE_IDLE | No more pixels in the pipe or fifo to process | O
FIFO_READ | AND of RMW_FIFO_RD_N, DMA_FIFO_RD_N | O
PIP(2:0) | # of pixels in the pipe | O
BUS_REV | Indicator for bus reversal on next pixel | O


5.4.0.2 Control module 











The Control Module is responsible for enabling one of 4 state machines (write_fsm, rmw_fsm, tr_fsm or
ld_reg_fsm). It also indicates to rmw_fsm the type of cycle to perform. The PIP counter resides in this module, as
does the arbiter for screen/memory refresh.

Table 32: Address Pipe Pin Description

Signals | Description | Type
MEM_CLK | Memory clock | I
YAD_IN(9:0) | Y address from write bank fifos | I
XAD_IN(7:0) | X address from write bank fifos. | I
M_Y_DISP(9:0) | Display refresh address for showing the next line. | I
F_DAT_VAL(0:1) | Data valid bits from write bank fifos. | I
FIFO_RD | Fifo read strobe | I
LD_EN_B | Load stage B in the address pipeline | I
LD_EN_C | Load stage C in the address pipeline | I
CID_ADR | Command to compute the AUX planes address | I
ROW_ADR | Enable the row address to the Vram address port | I
TR_FSM | Indication of serial read transfer | I
ADDR3 | Address X3 for selecting high/low pixel in AUX planes. | O
VRAM_ADR(8:0) | Multiplexed Vram address | O
P_DAT_VAL(0:1) | Pipelined version of F_DAT_VAL(0:1) | O
PHIT | Page hit indicator | O





5.4.0.3 Address Pipe 


The address pipe module contains the Vram address calculator and the address mux. This module is responsible 
for the address data path. The addresses have a 4 clock pipe line and can hold 3 different addresses at any time. The Vram 
page comparator also resides in this module. 

Table 33: RMW_FSM PIN DESCRIPTION

SIGNALS | DESCRIPTION | TYPE
SRESET | Synchronous reset | I
MEM_CLK | Memory clock 66MHz. | I
PHIT | Page hit from FB page comparator in ADDR_PIPE module | I
FAST_X | Decode for 8 bit CI and logicop with no cid chk (for X perf) | I
RMW | Enables this state machine. Otherwise idle | I
RRMW | Mem cycle requires read/read/modify write | I
BUS_REV | Bus reversal has occurred on current fifo read | I
RMW_FSM | Enable the RMW state machine | I
FAST_READ | Pipelined read of FB (due to Scr2Scr or Host DMA read) | I
PIP(2:0) | # of pixels left in pipeline | I
F_FIFO_MT | Write bank fifo empty | I
RDFIFO_FULL_(0:1) | Read bank fifo full | I
RD_V3 | Write bank fifo read delayed 3 clocks | I
DV | Data valid for rmw_state = W | I
J6_RMW | State J of the RMW FSM delayed by 6 clocks | I
TR_REQ | Refresh transfer request | O
RMW_FIFO_RD | Write bank fifo read command | O
RMW_STATES(5:0) | State machine state bits | O
RMW_PRECH_RAS | Precharge in next clock for RMW_FSM | O
DEC_R_PIX_CNT | Decrement PP counter | O


5.4.0.4 RMW state machine

The RMW state machine performs the following types of cycles:

a. Read frame buffer
b. Read modify write
c. Read/read modify write
d. Fast_X

The read cycles can read any of the bit planes in pipelined mode. The read data is written into the read bank fifos.
Read cycles can be used for PixBlit or host read operations.

The read modify write cycles can do the following:

a. CID checked writes in the pixel planes
b. Non CID checked blended writes in the pixel planes
c. Non CID checked color compared writes in any of the planes
d. Non CID checked writes in the aux planes

The read/read modify write cycles are used for the following:

a. CID checked color compared writes in any planes
b. CID checked blends in the pixel planes

The Fast_X mode is used for destination logic-ops only.
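
The cycle types listed above can be summarised as a small decode function. This is a hedged sketch of
the selection rules as they read in this section, not the Control Module logic: the function name, the
argument names and the priority between the cases are assumptions, and a request that needs no
destination pixel is assumed to go to the WRITE_FSM instead.

```python
def select_cycle(cid_check=False, blend=False, color_compare=False,
                 aux_planes=False, dest_logicop=False):
    """Pick a memory cycle type from the drawing mode, following the lists above."""
    # CID checking combined with blending or color compare needs two reads.
    if cid_check and (blend or color_compare):
        return "RRMW"
    # Any single destination dependency, or any AUX plane write, is a plain RMW.
    if cid_check or blend or color_compare or aux_planes:
        return "RMW"
    # Destination logic-op with nothing else enabled uses the fast path.
    if dest_logicop:
        return "FAST_X"
    # No destination pixel needed: assumed to be handled by the WRITE_FSM.
    return "WRITE"
```

For instance, select_cycle(cid_check=True, blend=True) returns "RRMW", while select_cycle(blend=True)
returns "RMW".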


Table 34: WRITE_FSM PIN DESCRIPTION

SIGNALS | DESCRIPTION | TYPE
SRESET | Synchronous reset | I
MEM_CLK | Memory clock 66MHz. | I
WRITE_FSM | Enable WRITE_FSM | I
TR_REQ | Refresh request for screen update | I
PHIT | Page hit from FB page comparator in ADDR_PIPE module | I
PIP(2:0) | # of pixels left in pipeline | I
F_FIFO_MT | Write bank fifo empty | I
BUS_REV | Bus reversal due to current read of fifo | I
W_FIFO_RD | Write bank fifo read command | O
W_STATES(4:0) | State machine state bits | O
W_PRECH_RAS | Precharge ras in next clock for WRITE_FSM | O
DEC_W_PIX_CNT | Decrement PP counter | O


5.4.0.5 Write state machine 


The WRITE_FSM executes write-only memory cycles. These cycles exclude any destination pixel processing.
Block-mode writes into the Vrams are also executed by this state machine. The PIP counter increments whenever
the write bank fifo is read and decrements whenever the pixel is written.
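
The PIP (pixels in pipe) counter behaviour can be sketched as a small up/down counter. This is only an
illustration: the class and method names are hypothetical, the 3-bit limit is taken from the PIP(2:0)
signal width, and it assumes one count per write bank fifo read, which the source does not state.

```python
class PipCounter:
    """Up/down counter tracking how many pixels are in flight in the pipe."""

    MAX_COUNT = 7  # PIP(2:0) is 3 bits wide

    def __init__(self):
        self.count = 0

    def fifo_read(self):
        """A write bank fifo read puts another pixel into the pipe."""
        if self.count >= self.MAX_COUNT:
            raise OverflowError("PIP counter would exceed 3 bits")
        self.count += 1

    def pixel_written(self):
        """A completed frame buffer write retires one pixel from the pipe."""
        if self.count == 0:
            raise ValueError("no pixel in the pipe to retire")
        self.count -= 1

    def pipe_idle(self):
        """True when no pixels remain in the pipe."""
        return self.count == 0
```

A transfer or register load would only be started once pipe_idle() holds, which mirrors the PIPE_IDLE
output of the Control Module.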

Table 35: LD_REG_FSM PIN DESCRIPTION

SIGNAL | DESCRIPTION | TYPE
SRESET | Synchronous reset | I
MEM_CLK | Memory clock 66MHz. | I
LD_REG | Enable LD_REG FSM | I
SEL_COLR_REG | Mux select for color data | I
LD_REG_STATES(3:0) | State machine state bits | O
LD_DONE | Register load done | O


5.4.0.6 Load registers state machine


LD_REG_FSM is responsible for loading the color register of the Vrams as well as the writemask register in the
RB2. SEL_COLR_REG indicates when the color register needs to be loaded in the Vrams; otherwise the writemask is
loaded in the RB2. Neither of these registers is readable from the RB2 or the Vram. A copy of these registers exists in the
REX3 register file. Since the read/write format logic is in the RB2, a copy of the PLANES(2:0), DRAWDEPTH(1:0),
RGBMODE, LOGIC-OP(3:0) and DBLSRC bits is sent along the sub bank 1 data bus while the writemask data is sent along sub
bank 0. Every time the write mask or the above mentioned registers are modified, a load write mask command is executed to
keep the RB2 up-to-date with the current context. The color data value for loading the color register in the Vrams is sent on
both the sub banks at the same time, one byte at a time.





8/13/93 page136 


SILICON GRAPHICS PROPRIETARY and CONFIDENTIAL 





Table 36: TR_FSM PIN DESCRIPTION

SIGNALS | DESCRIPTION | TYPE
SRESET | Synchronous reset | I
MEM_CLK | Memory clock 66MHz. | I
TR_FSM | Enable TR FSM | I
VREFRESH(2:0) | Number of memory refreshes to do in burst refresh | I
TR_STATES(3:0) | State machine state bits | O
TR_DONE | Transfer and refresh done | O
REF_TC | # of specified refreshes are done | O


5.4.0.7 Refresh


The leading edge of transfer request is detected, a 480nS timer is activated and the line counter is
incremented. During the 480nS period, a second transfer can be asserted by VC2 so the line counter can be incremented again
(for interlace mode). The timer will not be reset by the second transfer request. At the end of the 480nS period a refresh
request is made to the arbiter. The TR_FSM first does a read transfer cycle followed by zero to eight memory refresh cycles,
depending on the value of VREFRESH(2:0).


VREFRESH(2:0) being “000” disables memory refresh.

Table 37: OUT_BLOCK DESCRIPTION

SIGNALS | DESCRIPTION | TYPES
SRESET | Synchronous reset signal | I
MEM_CLK | Memory clock 66MHz. | I
TR_FSM | Enables transfer fsm | I
REF_TC | # of specified memory refreshes are done | I
LD_REG_FSM | Enables load register fsm | I
SEL_COLR_REG | Load color reg. in Vram | I
LD_STATES(3:0) | LD_REG_FSM state bits | I
WRITE_FSM | Enables the write fsm | I
W_STATES(4:0) | WRITE_FSM state bits | I
FASTCLEAR | Perform Vram blockfill (see block write function of Vram) | I
RMW_FSM | Enables rmw fsm | I
RMW_STATES(5:0) | RMW_FSM state bits | I
RMW | Read modify write cycles for rmw_fsm | I
RRMW | Read/read modify write cycles for rmw_fsm | I
FASTX | Special cycles for rmw_fsm to speed up X11 operations | I
PHIT | Page hit indicator for frame buffer | I
W_PRECH_RAS | Precharge ras in next clock for write_fsm | I
RMW_PRECH_RAS | Precharge ras in next clock for rmw_fsm | I
PIP(2:0) | # of pixels in the pipe | I
P_DAT_VAL(0:1) | Pipelined data valid bit from write bank fifo | I
EN_CCOMP | Enable color compare (also set for A function since comparators are shared) | I
CCOMP_PASS(0:1) | Color compare (or A function) pass bits | I
EN_CID_CHK | Enable cid checking | I
CID_PASS(0:1) | Cid pass bits | I
MAIN_IDLE | Main state machine is idle | I
ADDR3 | Address X3 from module addr_pipe1 | I
FIFO_RD | Write bank fifo read signal | I
FAST_READ | Fast read type cycles for the rmw state machine | I
OLAY_PLANES | Overlay planes are to be accessed | I
CID_PLANES | Access CID planes | I
PUP_PLANES | Access PUP planes | I
BLEND | Blend function is enabled | I
RAS | Row Address Strobe to Vrams | O
CAS(0:1) | Column Address Strobe to Vrams | O
DTOE_N | Data transfer / Output enable signal to Vrams | O
WBWE_N | Write per bit / write enable signal to Vrams | O
DSF1 | Special function signal to Vrams | O
RB2_SEL(2:0) | Byte select codes to RB2 | O
OE_N | Output enable for the data bus to RB2 | O
M_WR_FIFO_(0:1) | Write strobes for the read bank fifos | O
GO_DATA | Signal to start the data flowing from the fifo read register thru the dither and out. | O
LD_EN_B | Load stage B of address pipe | O
LD_EN_C | Load stage C of address pipe | O
ROW_ADR | Select row address for the address pipe | O
RD_V3 | Write bank fifo read delayed 3 clocks | O
RD_V8 | Write bank fifo read delayed 8 clocks | O
J6_RMW | State J of RMW_FSM delayed by 6 clocks | O
DV | Data valid for rmw state machine in state W | O

5.4.0.8 Out block module


The OUT_BLOCK module decodes the state bits of each state machine and generates frame
buffer control signals. It also controls the address pipeline and generates go_data to the data pipe to
control data flow. RB2 controls are also generated here by decoding the state bits. The encoding of
RB2_SEL(2:0) is as shown in Table 38.





RB2_SEL(2:0) | Function
0 | NOOP
1 | Write (4 components), also used to write LO pixel in AUX planes
2 | Write HI pixel in AUX planes (else if blend then hold data in sub bank 1 output of RB2)
3 | Load write mask and partial DRAWMODE1 Reg. into RB2
4 | Read (4 components), also used to read LO pixel in AUX planes.
5 | Read HI pixel in AUX planes
6 | Read LO pixel CID bits for cid checking.
7 | Read HI pixel CID bits for cid checking.

Table 38: RB2_SEL(2:0) Function codes.





When performing a Load Write Mask operation, the write mask is sent on the sub bank 0 data bus and the
drawmode1 register bits are sent on the sub bank 1 data bus to RB2. The drawmode1 register bits are sent
in the order shown in Table 39, starting with the LSB of the first byte.


siti | вип | юш | Bto | вив) | вии) | Bit) | 





Blend |Ғасісіваг | ВОВтоде | Dblsrc Panes: 0) TENET :0) TETTE 0) 





Table 39: Transmission order of Drawmode1 reg. bits to RB2

5.4.1 Gate Count
Block Description Gate count | Total(4Banks) 
ADDRESS PIPE Address pipeline to VRAMs 950 3800 
LD REG FSM State machine for loading registers 80 320 
OUT BLOCK Output decode block 650 2600 
RMW FSM Read modify write state machine 400 1600 
TR FSM Transfer and refresh state machine 150 600 
WRITE FSM Write state machine 200 800 
GEN. MC DECODE General MC decoder 100 400 
CONTROL MOD Master state machine controller 200 800 
Grand Total - 2730 10920 
Table 40: Gate count of memory controller state machine and address pipe. 

5.4.3 Pixel Processing Pipe


There are two pixel pipes in each bank to process the two adjacent pixels before writing into the frame buffer.
The processes can be dithering, logicop, alpha blending, cid checking, color comparison and afunction. In
order to reduce gate count in REX3, logical operation, write formatting and read formatting are performed in
RB2. The block diagram of the pixel pipes in each bank is shown below. The colors and alphas of two adjacent
pixels in the bank can be read from the write fifo. Each color component and alpha is pipelined through the
dithering block, then collected and formatted to frame buffer format in the RB2 chip. The cycle time of the pixel
processing pipe is 15 ns. Processing the 4 color components of a pixel takes 4 cycles (60 ns), which matches the
memory cycle time. When the blender is enabled, the source color component blends with the destination color
component through the blender pipe and then goes through the dithering block. The color compare block
compares not only the source color index with the destination color index but also the source alpha with the
reference alpha. Whenever the drawing context is changed, REX3 loads the write mask from pipe0 and some
draw mode bits from pipe1 into RB2 so that RB2 can perform read/write formatting and logicop.


5.4.3.1 Blender 


The blender performs the blending function Cb=Fs*Cs + Fd*Cd, where Cb is the blended color, Cs is the source
color and Cd is the destination color. The source factor Fs and destination factor Fd are defined in section 3.8.5,
Table 21 and Table 22. In order to reduce gate count, the equation Fs*Cs + Fd*Cd is decomposed to A + B
+ (+/-C)*(D+/-E) so that only one multiplier is required. The following table shows all the source factor and
destination factor combinations and the terms after decomposition. The block diagram of the blender is also
shown on the next page.
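
The single-multiplier decomposition can be checked numerically for a few factor combinations. The
sketch below is an illustration and does not reproduce the hardware's own term assignment, which is not
legible in this copy of the table: the (A, B, C, D, E) choices shown are one valid algebraic
decomposition each, with the +/- signs folded into the values of C and E, and all function names are
hypothetical.

```python
from fractions import Fraction


def blend_direct(fs, cs, fd, cd):
    """Reference form with two multiplies: Fs*Cs + Fd*Cd."""
    return fs * cs + fd * cd


def blend_single_multiply(a, b, c, d, e):
    """Decomposed form with one multiply: A + B + C*(D + E)."""
    return a + b + c * (d + e)


def terms_alpha_one_minus_alpha(rs, rd, alpha):
    """A*Rs + (1-A)*Rd  ==  Rd + A*(Rs - Rd)."""
    return (rd, 0, alpha, rs, -rd)


def terms_one_one_minus_src(rs, rd):
    """Rs + (1-Rs)*Rd  ==  Rs + Rd - Rs*Rd."""
    return (rs, rd, -rs, rd, 0)


def terms_one_minus_dst_one_minus_src(rs, rd):
    """(1-Rd)*Rs + (1-Rs)*Rd  ==  Rs + Rd - Rs*(Rd + Rd)."""
    return (rs, rd, -rs, rd, rd)


# Exact rational arithmetic avoids floating point noise in the comparison.
rs, rd, alpha = Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)
```

Each terms_* helper returns values that make blend_single_multiply agree with blend_direct for the
corresponding source and destination factors, which is the property the decomposition relies on.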

Table 28: Decompose Fs*Rs+Fd*Rd to A+B+(+/-C)*(D +/- E)

DF
Fs*Rs+Fd*Rd

0
Rd
Rs*Rd
(1-Rs)*Rd
A*Rd
(1-A)*Rd

Rs
Rs+Rd
Rs+Rs*Rd
Rs+(1-Rs)*Rd
Rs+A*Rd
Rs+(1-A)*Rd

Rd*Rs
Rd*Rs+Rd
Rd*Rs+Rs*Rd
Rd*Rs+(1-Rs)*Rd
Rd*Rs+A*Rd
Rd*Rs+(1-A)*Rd

(1-Rd)*Rs
(1-Rd)*Rs+Rd
(1-Rd)*Rs+Rs*Rd
(1-Rd)*Rs+(1-Rs)*Rd
(1-Rd)*Rs+A*Rd
(1-Rd)*Rs+(1-A)*Rd

A*Rs
A*Rs+Rd
A*Rs+Rs*Rd
A*Rs+(1-Rs)*Rd
A*Rs+A*Rd

5.4.3.2 Dithering


The following dithering block implements the dithering and rounding algorithms described in sections 3.8.2
and 3.8.3.


[Block diagram of the dithering block: signals I[7:0], S[11:0], S_F[3:0] and endither, comparator (A<B),
output ditherout]

5.4.3.3 Gate count of pixel processing pipe


The following estimated gate counts include the gates for ATPG.


Table 29: Gate count of pixel processing pipe

Block | Description | Gate count | Total
mem_data_io | read and write registers, mux's, cid check | |
dither | fifo read registers, dithering and rounding | 1260 X 8 |
ccomp | mux's and color compare | 200 X 8 |
blender | blender including multiplier | |
pipe_ctrl | pipe controller | 540 X 4 |
blend_ctrl | global blendfunction selector | 50 X 1 |
write fifo | 90x3 write fifo | 1350 X 4 | 5400
read fifo | 32x5 read fifo | 800 X 8 |
GRAND TOTAL | overall pixel pipe gate count | |

6 Revision History


0.1 First release 4-24-92 
0.2 5-4-92 
0.3 8-11-92 rws 


Linedraw: new opcodes (A_EDGE replaced by T_LINE, B_LINE); endpoint filtering now enabled by
DRAWMODE0 bit ENDPTFILTER; new secondary pixel calculation register BRESS2 added, and linedraw
algorithm updated to reflect this change; octant mirroring effect on minor axis fraction simplified to be
function (1.0 minus .frac) including case of .frac = zero; endpoint filtering algorithm completed; coverage
function BRESS renamed BRESS1, and its range behavior changed to [-1.0 thru 1.0].


Z buffered, antialiased linedraw: behavior modified so that ZPATTERN contains primary pixel zmask, and
the LSPATTERN register contains secondary pixel mask, for cases of (A_LINE, T_LINE, B_LINE) with
(ENZPATTERN and ENLSPATTERN both asserted).


Setup: overhead is one clock for quadrant calculation (span or block), four clocks for I_LINE, and eleven
clocks for the F_LINE, A_LINE, T_LINE, B_LINE cases.


Pipeline stalls: a one-clock delay is added for case of X,Y end condition reached, due to H/W implementa- 
tion issues at this clock speed. 


Context Switching: host overhead added for handling the SLOPERED register; see Section 3.12 for more 
details. 


Off-Screen Memory: reduced to 64 pixels wide (at right of screen). 


Screen-to-Screen Moves: XYMOVE is now treated as offset to destination; also, the XYMOVE is interpret- 
ed based on YFLIP, so the S/W can treat it as being YFLIP-independent. 


0.4 8-19-92 Adrian 


Added the explicit request for reloading the AWEIGHT table every time the slope (e1) of an antialiased line 
changes (in effect AWEIGHT must be reloaded for every antialiased line or edge). 


0.5 11-04-92 rws ERRATA LIST etc.


Z-Buffered Antialiased lines: modified so that zpattern is now used for bottom of clockwise-rotated line,
whereas lspattern is used for top. REX3 determines what is top or bottom. tline, bline functionality can be
achieved via setting masks above for desired edge. However, tline, bline selection of top versus bottom is
correct only for odd-numbered octants (1,3,5,7) and is reversed for even octants (2,4,6,8).


Subpixel positioned lines: the setup algorithm is modified to handle case where first pixel, when selected 
as closest pixel center to tangent of line, is different than specified vertex (along the minor axis). This adds 
the E and G tests. The benefit of this is that pixel selection is exact and independent of vertex swapping. 


Bresd term is changed from s17.8 to s18.8 format to handle full range of values. 


Default octant is “111” where octant is defined as XMAJOR & XDEC & YDEC, where $DEC indicates whether 
stepping of axis $ is in the negative direction (if =1) or positive direction (if =0). There is one exception to 
this: if doing line setup, then minor axis $DEC is zeroed if (BRESINC1=0 and (major axis $DEC=1)). This 
is done so that horizontal or vertical edges will have opposing behavior for antialiasing when direction is 
reversed. 
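
The octant rule above can be sketched as a small software model. This is only an illustration, not the REX3 logic: it assumes "&" denotes concatenation with XMAJOR as the most significant bit, and that the setup exception reads "major axis $DEC=1". The function name and arguments are hypothetical.

```python
def octant_bits(xmajor, xdec, ydec, line_setup=False, bresinc1=None):
    """Return the 3-bit octant code, assumed ordered XMAJOR, XDEC, YDEC (msb to lsb)."""
    xmajor, xdec, ydec = int(bool(xmajor)), int(bool(xdec)), int(bool(ydec))
    # Exception during line setup: zero the minor-axis DEC bit when
    # BRESINC1 is 0 and the major-axis DEC bit is 1 (assumed reading).
    if line_setup and bresinc1 == 0:
        if xmajor and xdec:
            ydec = 0          # X is major, so Y is the minor axis
        elif not xmajor and ydec:
            xdec = 0          # Y is major, so X is the minor axis
    return (xmajor << 2) | (xdec << 1) | ydec
```

For example, under these assumptions an X-major line stepping in negative X with BRESINC1=0 yields octant "110" after setup, rather than the default "111".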


BRESROUND interpretation is: msb pertains to octant 1, lsb to octant 8. 


Host must zero the color fractions in the COLOR$ registers whenever drawing with host data, or doing 
screen to screen moves. Else rounding may occur: not an issue for CI. 


Addresses never to be written with GO: LSSAVE, LSRESTORE, SETUP. 
COLORRED format change: if written to with DRAWMODE set at 12b CI mode, the h/w will take assumed 
format of 012.9 and place it in a 012.11 field, shifting up by 4 and zero filling. Context restore must then be 
done with DRAWMODE not in 12b CI mode. This new COLORRED format is a hack to handle certain 
constraints imposed by floating point format (23b mantissa). 


Lines cannot be read back to host or атал as lines. 


Endpoint filtering coverage details: pixel coverage along major axis is 0x10 to 0x1f for each pixel. This 
value is comprised of a start coverage and an end coverage. If current pixel contains the starting point, a value 
of [XSTARTFRAC xnor XDEC] yields the starting coverage (note: xmajor case assumed here for discussion). If 
the current pixel does not contain the starting point, a value of 0x0f is used. The ending point coverage is 
calculated similarly, but using, when current pixel is the endpoint, the value [XENDFRAC xor XDEC]. The pixel 
coverage is the sum of (starting coverage plus ending coverage plus one). The 5b result is used in the 
following way: if coverage<3>=1, then the AWEIGHT coverage value selected by the minor axis antialiasing 
lookup is passed through unchanged as the coverage value; elsif coverage<2>=1, the AWEIGHT value is 
shifted right one position; and so forth to a minimum coverage value of AWEIGHT shifted down three 
positions (divided by eight). This naturally handles the case of starting and ending points being within a single 
pixel square. 
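
The coverage arithmetic above can be modelled in a few lines. This is a sketch under stated assumptions, not the hardware implementation: the fractions are taken as 4-bit values, the single DEC bit is assumed to be replicated across all four bits for the xnor/xor, and the function names are hypothetical.

```python
def endpoint_coverage(start_frac, end_frac, dec, has_start, has_end):
    """5-bit major-axis coverage for one pixel (x-major case: dec is XDEC)."""
    mask = 0xF if dec else 0x0            # DEC bit replicated over 4 bits (assumption)
    # Starting coverage: [STARTFRAC xnor DEC], or 0x0f if the start is elsewhere.
    start_cov = (~(start_frac ^ mask)) & 0xF if has_start else 0xF
    # Ending coverage: [ENDFRAC xor DEC], or 0x0f if the end is elsewhere.
    end_cov = (end_frac ^ mask) & 0xF if has_end else 0xF
    return (start_cov + end_cov + 1) & 0x1F


def filtered_aweight(aweight, coverage):
    """Scale the AWEIGHT value by the highest set bit among coverage<3:1>."""
    for shift, bit in enumerate((3, 2, 1)):
        if coverage & (1 << bit):
            return aweight >> shift       # shift 0, 1 or 2
    return aweight >> 3                   # minimum: divided by eight
```

An interior pixel (neither endpoint) gives 0x0f + 0x0f + 1 = 0x1f, so AWEIGHT passes through unchanged; a pixel holding both endpoints sums the two partial coverages, which is the single-pixel case the text mentions.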


Programming restrictions: never do a write with "GO" set to the following registers or fields, unless DRAW- 
MODE bit DOSETUP=1: 


xend (all formats), xymove, xywin, drawmode, octant; 


the issue is related to changing the end condition in the X direction having some latency within REX3. So, 
if the write does not change any bit affecting the X end condition, there is no problem. (from tarolli: xend is 
only used with GO for drawing flat shaded triangle spans; drawmode mostly written with DOSETUP=1;) 


Fastclear and CID checking: REX3 will disable FASTCLEAR mode if CID checking is enabled. Therefore 
host must set up fast clear operation by writing COLORVRAM and also setting up DRAWMODE and 
COLORI (for example) register. This is necessary because GL will not know if window system invokes CID 
check. Fastclear and dithering: host (GL) will not invoke FASTCLEAR if dithering is enabled. 


Screenmasks: Host must add XYWIN offset to SMASKS1-4; host must add XYMOVE offset to all SMASKs, 
as REX3 does not do this. 


Setting REX3 color registers to value using sub-24b RGB color value: although the host may initialize color 
for drawing with a simple write to COLORI, this register only supports 8-8-8 B-G-R packing when in RGB 
mode (no problem for CI mode). In order for host to initialize color to a 3-3-2 value, for instance, more work 
must be done. The simplest solution is for host to replicate the desired color value into the two most 
significant pixel fields within HOSTRW format and do the write. This allows flat fill to occur at the full 100Mpix/ 
sec rate. REX3 performs replication into 24b field automatically, so color rounding does not affect the result 
written (X11 issue). An alternate solution is for host to replicate each component into 8b fields of COLORI; 
clearly this is more CPU cycles. [note: in REX1 we simply would put the chip into CI mode to write sub-24b 
RGB values into framebuffer; in REX3 we have not swizzled the CI bits the same way as the RGB bits so 
this will not work, unfortunately]. 
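
The alternate solution (host replicating each component into the 8b fields of COLORI) might look like the sketch below. It is illustrative only: the helper names are hypothetical, the components are passed separately to avoid assuming a 3-3-2 bit layout, and "8-8-8 B-G-R packing" is assumed to mean blue in bits 23:16, green in 15:8 and red in 7:0.

```python
def replicate(value, bits):
    """Expand a `bits`-wide component to 8 bits by repeating its bit pattern."""
    out, filled = 0, 0
    while filled < 8:
        out = (out << bits) | value
        filled += bits
    return out >> (filled - 8)            # drop any excess low-order bits


def colori_from_332(r3, g3, b2):
    """Build a 24b COLORI value from 3-bit red, 3-bit green, 2-bit blue."""
    r, g, b = replicate(r3, 3), replicate(g3, 3), replicate(b2, 2)
    return (b << 16) | (g << 8) | r       # assumed B-G-R byte order
```

Bit replication maps the maximum sub-24b value to 0xff and zero to 0x00, which is why it is preferred here over a plain left shift.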


0.6 11-10-92 rws 
fline no longer does G test, and has no minor axis adjustment step during setup. 
AWEIGHT ordering across AWEIGHT0 & AWEIGHT1 is simply most to least coverage, nibbles. 


Endpoint filtering, when enabled by DRAWMODE0<22>, is applied to both endpoints of the line. Use of 
SKIPFIRST or SKIPLAST results in no filtering for the (masked) endpoint. Filtering of the starting vertex is 
done for each “GO” event, so when a line is broken up into segments endpoint filtering should be turned off 
for all but the first segment, and the ending vertex. Ending vertex filtering is only done for case of major axis 
end value reached. Therefore, unless a line is drawn atomically and/or with SKIPLAST set, filtering of the 
endpoint requires host intervention. (Not only must the endpoint filter be separated out of the loop, drawing 
the endpoint alone is considered as both a start AND end condition, so host must modify the major axis 
starting fraction to inhibit it from interfering with REX3 endpoint coverage calculation). Next time we may 
want to have separate startptfilter, endptfilter bits to alleviate this problem. 


0.7 11-11-92 adrian 
Removed the G test. Removed the start point coordinate adjustment based on the E test. 
0.8 11-20-92 rws 


XYOFFSET feature (adds XYMOVE to XYSTARTI during execution) is now only supported for framebuffer 
writes, not reads. (changed to improve timing, assumed not useful for reads). 


0.9 12-03-92, mod 1-25-93 rws 


Point draw notes: in order to circumvent the rule of no write to XEND with “GO” set, host may simply forego 
writing to XEND, YEND registers. Point is drawn in span (or block) mode with just the XSTART,YSTART 
pair. For DDA data, simply set STOPONX=STOPONY=0; for HOSTRW data source, set COLORHOST 
and/or ALPHAHOST as needed; and also set HOSTPACKED=0. 


0.A 12-08-92 rws 


Setup overhead update: some increase here. Spans/Blocks: 3 clocks; Integer Lines: 5 clocks; Subpixel 
Lines: 15 clocks. The Spans/Blocks increase due to chip timing modifications; the Subpixel Lines increase 
due mainly to E test. (Each increased by one clock for state machine timing issue, also.) 


0.B 01-25-93 rws 


XYWIN hints for GL: since the X,Y coordinates are biased by some number to facilitate simple vertex 
float-to-fixed point conversion, the GL may typically send to REX3 highly biased values. The YFLIP feature 
negates the Y values (YSTART, YEND, YMOVE, SMASK0Y) thereby subtracting the result from YWIN to 
calculate the Y value on the screen. In order for this to work correctly, the YWIN value must then be biased by 
TWO times the float-to-fixed bias whenever YFLIP is invoked. Internally, REX3 relies on 4K,4K bias for X, Y 
to be relative to screen (upper left) origin. 
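
The factor of two can be seen with a simplified model. This sketch is an assumption-laden illustration, not a description of REX3 internals: it ignores the internal 4K,4K bias, treats the screen Y as YWIN plus (or, with YFLIP, minus) the biased coordinate, and uses a hypothetical BIAS constant and hypothetical function names.

```python
BIAS = 4096  # hypothetical float-to-fixed bias carried by every Y coordinate


def screen_y(y_reg, ywin, yflip):
    """Simplified model: YFLIP negates the Y value before adding YWIN."""
    return ywin - y_reg if yflip else ywin + y_reg


def ywin_for(window_y, yflip):
    """YWIN value that cancels the coordinate bias in this model."""
    # Without YFLIP the bias is added, so YWIN must subtract it.
    # With YFLIP the bias is subtracted, so YWIN must add it back.
    return window_y + BIAS if yflip else window_y - BIAS


# The two YWIN settings differ by exactly twice the bias.
delta = ywin_for(100, True) - ywin_for(100, False)
```

In this model a biased coordinate of 10 + BIAS lands at window_y + 10 without YFLIP and at window_y - 10 with it, and `delta` equals 2 * BIAS.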


0.C 02-11-93 rws 


Antialiased line coverage is a function of host-supplied slope BRESE1. This value must be calculated by 
using coordinate X, Y values which have the 7 lsb's cleared (lsb's of mantissa, assuming float-to-fixed point 
values being written by the GL to XSTARTF, YSTARTF, XENDF, YENDF). This in essence matches REX3 
coordinate snapping to the 4b of subpixel precision, and will yield the best results. 
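
A minimal sketch of the snapping step follows. It assumes the coordinates are available to the host as integer images of the float-to-fixed mantissa; the helper names are hypothetical, and the derivation of BRESE1 from the snapped deltas is deliberately not shown since it is not given here.

```python
SNAP_MASK = ~0x7F  # clears the 7 least significant bits


def snap(coord):
    """Clear the 7 lsb's so the value matches 4b subpixel precision."""
    return coord & SNAP_MASK


def snapped_deltas(xstart, ystart, xend, yend):
    """Deltas computed from snapped coordinates, as input to the slope term."""
    return snap(xend) - snap(xstart), snap(yend) - snap(ystart)
```

With a 23b mantissa, clearing 7 bits leaves 16 significant bits, which is consistent with 12 integer bits plus 4 subpixel bits.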


0.D 06-21-93 rws 


Point draw hints: it is simplest to use adrmode=iline, then there is no need to initialize the octant or to 
maintain consistent state of endpoint. No setup/dosetup is required; this approach preferred over that of 12-03 
above. 


X11 tiling hints: unfortunately REX3 does not provide support for arbitrary tiling, other than the primitive ap- 
proach of host sending packed pixels and GO for each HOSTRW write. Performance test relies somewhat 
on 4x4 tiles, which REX3 doesn't support in the most efficient way. (an easy thing to add "next time"). 


Revision 0 bugs: 
dma preemption protocol bug is fixed in prototypes via rework, will fix in Rev 1 rex. 
dma resumption protocol bug (unspec'd MC behavior) causes 32b dma mode to fail, will fix Rev 1. 
memory controller bug for R-M-W screen-to-screen moves, state machine fix in Rev 1. 


stipple rotate bug for xby2 case of xstarti=xendi, s/w workaround by init'ing octant; fix Rev 1 
(unfortunately lronly behavior required host to do the check, fixed Rev 1.) 


octant bit “xdec” was ‘1’ for xstart=xend, changed to ‘0’ to simplify above solution in Rev 1. 
subpixel bresenham setup Etest unnecessary, deleted in Rev 1. (s/w workaround meanwhile). 
aline coverage term incorrect, algorithm modified to base on sign (BRESS1), not S2. fix in Rev 1. 
context switch of colorgreen, colorblue do not preserve sign, must treat same as colorred (s/w fix). 
ystartf write format bug, s/w to use other address/format. fix in Rev 1. 

opcode=read bit not pipelined, affects loop of pio read and writes; fix in Rev 1. 

fastclear doesn't work for TI vrams, only does 4b per vram (not 8). new mode bit in Rev 1. 

lronly span reject for polygons doesn’t autoincrement ystarti for case of block; s/w does spans 


write of register after write of HOSTRW1 sometimes ignored; assume process/fab problem (LSI). 
s/w workaround is to write that register two or three times. 


Revision 1 features added: 
additional register (USER_STATUS) for reading interrupt status without resetting any bits. 
Changed FB24 bit in CONFIG register to FB_TYPE for non-TI/TI column mask-fastclear mode 
ystride bit added to Drawmode0, to bump YSTARTI by +/-2 at block row end, for video option. 
revision register value changed to '1'. 

Revision 1 bugs: (07-21-93 JEL) 


Preempting a burst read from DCBDATA when the last requested byte is being transferred to REX3 
will result in REX3 continually asserting GRXDLY in response to subsequent reads from DCBDATA. 


The (double)word in which the last requested byte resides will be lost. 


The recovery procedure involves writing to DCBRESET, and re-initiating the burst read from a point 
prior to where the "lost" data may be reacquired. 


This failure mode has only shown up in DMA reads from the KALEIDOSCOPE VIDEO option, 
for which there is an adequate software workaround. 


It is not possible to resume a preempted burst read from HOSTRW, if a read from DCBDATA occurs 
during the preemption period. Similarly, it is not possible to resume a preempted burst read from 
DCBDATA, if a read from HOSTRW occurs during the preemption period. Therefore, prior to 
performing a read from HOSTRW or DCBDATA, in addition to making sure that the graphics pipe 
or the backend is not busy, the host should ensure that a DMA read from DCBDATA or HOSTRW is 
not in progress. 